Description

This toolkit provides a machine learning pipeline for small-molecule drug screening. It includes:

A Jupyter Notebook for data parsing and preprocessing, model evaluation, optimization, and saving.
An object-oriented Python script (ml_launcher.py) that uses the trained estimator to make predictions locally on the user's computer. The notebook can also be modified to run predictions in a Google Colab environment.

The model is a binary classifier that predicts bioactivity, returning:

1 for active compounds (Ki ≤ 50 nM).
0 for inactive compounds (Ki > 50 nM).
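The labeling rule above can be sketched as a small function. This is only an illustration of the 50 nM cutoff; the names `ACTIVITY_THRESHOLD_NM` and `label_from_ki` are hypothetical and are not taken from the notebook or from ml_launcher.py.

```python
# Cutoff stated in this README: Ki <= 50 nM counts as active.
ACTIVITY_THRESHOLD_NM = 50.0


def label_from_ki(ki_nm: float) -> int:
    """Return 1 (active) if Ki <= 50 nM, otherwise 0 (inactive)."""
    return 1 if ki_nm <= ACTIVITY_THRESHOLD_NM else 0


if __name__ == "__main__":
    # Illustrative Ki values in nM, not taken from the dataset.
    for ki in (12.5, 50.0, 50.1, 830.0):
        print(ki, label_from_ki(ki))
```

Note that the boundary value (exactly 50 nM) is labeled active, matching the "≤" in the definition.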

The example dataset targets the 5-HT7 receptor. It demonstrates the workflow, which can be adapted to predict bioactivity for any biological target. Note that the 5-HT7 dataset is small (177 molecules after preprocessing) and somewhat imbalanced between active and inactive compounds, so predictions may be suboptimal. The toolkit should therefore be read as an example of a fingerprint-based binary classifier. Other models may perform better for other biological targets, so it is crucial to evaluate and select the most suitable estimator for each target.
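Because the classes are imbalanced, plain accuracy can look better than the model really is. The sketch below is a minimal pure-Python illustration of checking the class balance and computing balanced accuracy (the mean of per-class recall). The function names and the labels are made up for illustration and are not taken from the notebook or the 5-HT7 dataset.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of each class label in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall, which ignores class sizes."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Made-up labels for illustration only.
    y_true = [1, 1, 1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 1, 1, 1, 1, 0]
    print(class_balance(y_true))
    print(balanced_accuracy(y_true, y_pred))
```

In this toy case 7 of 8 predictions are correct, yet only half of the minority class is recovered, which balanced accuracy exposes. scikit-learn offers the same metric as `sklearn.metrics.balanced_accuracy_score`.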

Instructions

To run the ml_launcher.py script, ensure the following files are in the same directory:

ml_launcher.py (script for making predictions).
best_rfc_model.joblib (the trained estimator).
example.csv (the file containing molecules to be analyzed).
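As a quick sanity check before launching, the file layout above can be verified with a short script. This is a sketch using only the standard library; the helper `missing_files` is hypothetical and is not part of ml_launcher.py.

```python
from pathlib import Path

# File names listed in the instructions above.
REQUIRED_FILES = ("ml_launcher.py", "best_rfc_model.joblib", "example.csv")


def missing_files(directory="."):
    """Return the required file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```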

Requirements Installation

If necessary, install the required Python packages using the following command:

pip install pandas numpy rdkit joblib

Once the required packages are installed, you can run the script in your local environment.
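To confirm that the packages are visible to the interpreter you intend to use, a check along these lines may help. It is a sketch based on the standard library's `importlib.util.find_spec`; the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names of the packages listed in the pip command above.
REQUIRED_MODULES = ("pandas", "numpy", "rdkit", "joblib")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If the saved estimator was built with scikit-learn (the file name best_rfc_model.joblib suggests this, but it is an assumption), loading it would also require the scikit-learn package, which could be checked by adding "sklearn" to the tuple.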

List of Files

5ht7_IC50.csv - raw IC50 dataset from ChEMBL.
5ht7_Ki.csv - raw Ki dataset from ChEMBL.
ml_notebook.ipynb - Jupyter Notebook containing the model definition and training pipeline.
best_rfc_model.joblib - trained and saved estimator.
example.csv - example file with molecules for prediction.
ml_launcher.py - Python script for running predictions.
fps_esti_schema.png - schematic diagram.

Adam-maz/Fingerprints-based_tool_for_small_drug_screening