Publication:
IDENTIFICATION OF NOVEL DRUG-LIKE MATRIX METALLOPROTEINASE-2 (MMP-2) INHIBITORS THROUGH CONSENSUS MOLECULAR FINGERPRINT MACHINE LEARNING.

Abstract
Matrix metalloproteinase-2 (MMP-2) is an important proteolytic enzyme that controls degradation of the extracellular matrix (ECM) and is involved in inflammation, cancer, and cardiovascular diseases, among other conditions. Combining machine learning with quantitative structure-activity relationship (QSAR) modelling could overcome the limitations of classical QSAR modelling, which derives bioactivity through linear regression and captures only linear relationships within a single analogue series. Konstanz Information Miner (KNIME), a free and open-source data analytics platform, was used to develop the machine learning models. Bioactivity datasets of MMP-2 inhibitors were downloaded from ChEMBL and PubChem. The ChEMBL dataset was used as the training dataset. A total of 15 machine learning models were built from five algorithms: 1) Random Forest, 2) Extreme Gradient Boosting (XGBoost), 3) Naïve Bayes, 4) Support Vector Machine (SVM), and 5) Probabilistic Neural Network (PNN), each combined with three molecular structure fingerprints: 1) Molecular ACCess System (MACCS), 2) Feat Morgan, and 3) Atom Pair. The PubChem dataset was used for external validation of the models. Commercially available compound libraries from Enamine, USA, were used as test datasets for shortlisting novel MMP-2 inhibitors. The shortlisted compounds were further filtered by molecular docking with Glide (Schrödinger small-molecule drug discovery suite, version 2022-1). The resulting compounds were purchased from Enamine and tested for MMP-2 inhibitory activity by determining RNA and protein expression in human neuroblastoma (SH-SY5Y) cells. MMP-2 RNA expression was quantified using quantitative reverse transcription polymerase chain reaction (RT-qPCR), and protein expression was determined using ELISA.
The MACCS fingerprint showed the best predictive performance across all the machine learning models, followed by Atom Pair and Feat Morgan. The Random Forest algorithm showed the best classification (active vs inactive) capability, with the highest scores across all performance metrics. The classification performance of Naïve Bayes, XGBoost, and Probabilistic Neural Network was equal. The Support Vector Machine showed the weakest classification capability. The hits identified from the Enamine commercial database showed favourable binding (negative free energy) in molecular docking studies. The shortlisted hits inhibited MMP-2 gene and protein expression in human neuroblastoma cells. Machine learning methods were thus successfully deployed to identify novel, potent MMP-2 inhibitors.
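The study design described in the abstract (five algorithms crossed with three fingerprints, giving 15 models) can be sketched in a few lines of Python. This is only an illustration, not the authors' KNIME workflow: the function names `model_grid` and `consensus_label` are hypothetical, and the majority-vote rule is an assumption, since the abstract does not state how the per-fingerprint predictions were combined into a consensus.

```python
from itertools import product

# Algorithm and fingerprint names as listed in the abstract.
ALGORITHMS = ["Random Forest", "XGBoost", "Naïve Bayes", "SVM", "PNN"]
FINGERPRINTS = ["MACCS", "Feat Morgan", "Atom Pair"]


def model_grid():
    """Enumerate every (algorithm, fingerprint) pair: 5 x 3 = 15 models."""
    return list(product(ALGORITHMS, FINGERPRINTS))


def consensus_label(votes):
    """Combine per-fingerprint predictions for one compound.

    `votes` maps a fingerprint name to "active" or "inactive".
    ASSUMPTION: a strict majority of "active" votes makes the compound
    a consensus active; the abstract does not specify the actual rule.
    """
    if not votes:
        raise ValueError("at least one prediction is required")
    n_active = sum(1 for label in votes.values() if label == "active")
    return "active" if 2 * n_active > len(votes) else "inactive"


if __name__ == "__main__":
    print(len(model_grid()))  # 15
    example = {"MACCS": "active", "Feat Morgan": "inactive", "Atom Pair": "active"}
    print(consensus_label(example))
```

Under this assumed rule, a compound predicted active by two of the three fingerprint models would be carried forward to docking, while one predicted active by a single model would not.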
Keywords
Peptide Hydrolases, Molecular Docking Simulation, Machine Learning, Neuroblastoma