Development, evaluation, and In-vitro assessment of artificial intelligence antidiabetic predictive models from α-glucosidase inhibitors
Date
2024
Publisher
University of the Western Cape
Abstract
Background: The global rise in the prevalence of diabetes mellitus (DM) poses a significant health challenge, necessitating effective therapeutic interventions. Type 2 DM (T2DM) remains the most prevalent form of diabetes, accounting for over 90% of cases. Intestinal α-glucosidase enzymes, located at the brush border of the small intestine, are important treatment targets in T2DM because they catalyze the final step of carbohydrate digestion in the gut, liberating glucose molecules that are then transported into the bloodstream. Controlled inhibition of these enzymes is pivotal for managing postprandial hyperglycemia, which contributes to vascular complications in T2DM. Because α-glucosidase inhibitors act locally at the brush border, they do not need to be absorbed systemically and directly mitigate postprandial hyperglycemic spikes. Acarbose and miglitol are FDA-approved drugs that target and inhibit α-glucosidase. Despite their benefits, these drugs have drawbacks. For example, acarbose has been shown to have many side effects and to undergo gastrointestinal degradation, which may limit its efficacy. Miglitol, by contrast, is less localized to the gastrointestinal tract because of its rapid absorption and high bioavailability, potentially diminishing its efficacy. These drawbacks underscore the need to search for viable alternatives. Traditional drug discovery has relied primarily on empirical compound screening, which has been historically successful but is laborious, time-consuming, and expensive. In recent years, many strategies have been applied to drug discovery to reduce the time and cost required and to increase the chance of success. One such strategy is the introduction of artificial intelligence (AI) into drug discovery.
The integration of machine learning (ML) and deep learning (DL) techniques, both subfields of AI, has emerged as an advancement in Quantitative Structure-Activity Relationship (QSAR) approaches to drug discovery. The QSAR approach can analyze vast chemical datasets and potentially uncover novel α-glucosidase inhibitors more rapidly and cost-effectively than traditional methods. Therefore, this project set out to analyze the chemical space of existing α-glucosidase inhibitors that have experimentally determined IC50 values, develop ML and DL models for predicting α-glucosidase inhibitors, use these models for virtual screening to identify potential α-glucosidase inhibitors, and perform in vitro assays on the identified hits to validate the predictions.
Method: The study began with the assembly and preparation of a library of α-glucosidase inhibitors with experimentally determined IC50 values. The prepared data were categorized as active or inactive to enable ML classification tasks. A stringent IC50 threshold of ≤ 2.04 μM was used to label compounds as active; compounds with higher IC50 values were labelled inactive. The chemical space of the prepared data was explored extensively to understand the properties influencing the reported bioactivity. Random Forest (RF) and Support Vector Machine (SVM) models were developed using 2D and 3D molecular descriptors and extended-connectivity fingerprints (ECFP). Additionally, state-of-the-art DL models were created using Graph Neural Network (GNN) architectures, including Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), Graph Isomorphism Networks (GIN), and Attentive Fingerprints (AFP). GNNs work directly with molecular graphs, in which atoms are nodes and bonds are edges, and so capture detailed structural and relational information within molecules, making them effective for modelling complex biochemical interactions. The models were evaluated across multiple performance metrics to identify the most effective techniques for predicting potent inhibitors. The top-performing models were then used for virtual screening against the DrugBank and Coconut databases. Promising candidates identified from the screening were subsequently subjected to in vitro validation to confirm their inhibitory activity against α-glucosidase and evaluate their therapeutic potential.
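To make the graph representation described above concrete, the following is a minimal sketch of a molecule stored as nodes (atoms) and edges (bonds), followed by one unweighted neighbourhood-averaging step. It is illustrative only and is not the GCN, GAT, GIN, or AFP implementation used in the study: the example molecule (the heavy atoms of ethanol), the use of atomic number as the sole atom feature, and the names `build_adjacency` and `message_pass` are all assumptions introduced here, and no learned weights are involved.

```python
# Toy molecular graph: the heavy atoms of ethanol, C-C-O (an illustrative choice).
atoms = {0: "C", 1: "C", 2: "O"}      # node index -> element symbol
bonds = [(0, 1), (1, 2)]              # undirected edges between node indices
ATOMIC_NUMBER = {"C": 6.0, "O": 8.0}  # single atom feature (an assumption)


def build_adjacency(atoms, bonds):
    """Build an adjacency list: every bond links both of its atoms."""
    adjacency = {index: [] for index in atoms}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def message_pass(features, adjacency):
    """One aggregation step: average each atom's feature with its neighbours'."""
    updated = {}
    for node, neighbours in adjacency.items():
        values = [features[node]] + [features[n] for n in neighbours]
        updated[node] = sum(values) / len(values)
    return updated


adjacency = build_adjacency(atoms, bonds)
features = {index: ATOMIC_NUMBER[symbol] for index, symbol in atoms.items()}
updated = message_pass(features, adjacency)
```

After one step, the feature of each atom already reflects its bonded neighbours; trained GNN layers replace the plain average with learned, and in the case of GAT attention-weighted, aggregation.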
Results: The chemical space exploration underscored the importance of properties such as molecular weight, cLogP, hydrogen bonding, and molecular rigidity in distinguishing compounds' activities as defined by the set IC50 threshold. Compounds with an IC50 ≤ 2.04 μM were generally characterized by higher hydrophilicity, lower cLogP, more negative cLogS, greater polarity, and higher hydrogen bonding capacity, whereas compounds above the threshold showed the opposite trends. The RF and SVM models trained on the descriptors performed robustly across the evaluation metrics, with 2D descriptors and ECFP4 molecular representations outperforming 3D descriptors. Virtual screening of the DrugBank database using these models identified drugs that are potential α-glucosidase inhibitors, demonstrating the potential of ML models in drug repurposing efforts. However, the RF model with ECFP4 fingerprints was highly conservative, predicting only 0.3% of the DrugBank screening set to be active. In contrast, the GNN models were less conservative, with an active prediction rate of 5.3% in the virtual screening of the Coconut database.
In vitro assessment revealed that the compounds predicted in virtual screening by the RF model did not show measurable IC50 values, highlighting a limitation in corroborating its predictive capability through experimental validation. Conversely, the in vitro assessment of selected compounds predicted by the GAT model showed better agreement with the computational predictions: compounds predicted as active had measurable IC50 values, although these values did not meet the stringent threshold set for active classification (IC50 ≤ 2.04 μM).
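The labelling rule and the active prediction rates reported above can be sketched as follows. This is a minimal illustration rather than the study's code: the threshold of 2.04 μM is taken from the abstract, but the function names `label_activity` and `active_rate` and the example IC50 values are hypothetical.

```python
ACTIVE_IC50_THRESHOLD_UM = 2.04  # activity threshold stated in the abstract (μM)


def label_activity(ic50_um):
    """Label a compound 'active' if its IC50 (in μM) is at or below the threshold."""
    return "active" if ic50_um <= ACTIVE_IC50_THRESHOLD_UM else "inactive"


def active_rate(labels):
    """Fraction of labels (or model predictions) that are 'active'."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == "active") / len(labels)


# Made-up IC50 values in μM, used purely for illustration.
example_ic50_um = [0.5, 2.04, 2.05, 150.0]
labels = [label_activity(value) for value in example_ic50_um]
rate = active_rate(labels)
```

Under this rule, a hit with a measurable IC50 above 2.04 μM is still labelled inactive, which matches the situation described for the GAT-predicted compounds.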
Conclusion: The chemical space exploration of the compound library with in vitro inhibitory activities against intestinal α-glucosidase revealed the importance of polarity and hydrogen bonding capacity in potent α-glucosidase inhibitors. The top-performing ML (RF) and DL (GAT) models built on the structural data of the compound library were used for virtual screening of the DrugBank database and the Coconut database of natural compounds, respectively, to rapidly mine potential α-glucosidase inhibitors. In vitro assessment of the hit compounds showed that the GAT model predictions aligned better with empirical IC50 data.
Keywords
Diabetes, α-glucosidase, QSAR, artificial intelligence, machine learning