Post Drug repurposing analysis utilizing ADME screening, toxicity predictions with Tox21, and QSAR modeling to evaluate absorption, distribution, metabolism, and excretion profiles, predict toxicities, and assess molecular activity. This streamlined workflow accelerates drug discovery while ensuring safety and efficacy.

hossein-noorollahi/PDRA


Post Drug Repurposing Analysis (PDRA)

This repository focuses on a comprehensive Post Drug Repurposing Analysis workflow. The project aims to evaluate candidate drugs identified through repurposing efforts by analyzing their Absorption, Distribution, Metabolism, and Excretion (ADME) profiles, predicting toxicities using the Tox21 dataset, and assessing molecular activity through Quantitative Structure-Activity Relationship (QSAR) modeling. This streamlined approach is designed to accelerate drug discovery while ensuring both safety and efficacy.

Project Overview

Drug repurposing is an efficient strategy to identify new therapeutic uses for existing drugs. This project provides a robust framework for the post-analysis of drug repurposing candidates. It encompasses crucial stages such as ADME screening, toxicity prediction, and QSAR modeling, offering a holistic evaluation of drug candidates. The ultimate goal is to generate actionable insights that can expedite the drug discovery pipeline.

Current Progress

At this preliminary stage, the project focuses on preparing molecular data for subsequent pharmacokinetic and toxicity evaluations. This includes:

  1. Filtering initial drug repurposing results based on specific criteria.
  2. Retrieving essential chemical identifiers (CID) and structural representations (SMILES) from the PubChem database.
  3. Formatting data for compatibility with external ADME prediction tools like SwissADME.
  4. Beginning initial work on clustering the results and identifying significant drug groups with unsupervised learning methods.

The complete code and explanations for the remaining, more advanced steps are under development and will be added soon.

Implemented Steps

Step 1: Filtering Drug Repurposing Results

This step processes raw drug repurposing data to refine the list of candidates based on predefined criteria.

  • Input: export.csv (a CSV file containing initial drug repurposing results, including Rank, Score, Type, ID, Name, Description).
  • Process:
    • Reads the export.csv file.
    • Keeps rows whose Score is greater than 90 and groups them by Type ('cp', 'kd', 'oe', 'cc').
  • Output: Four CSV files, one per Type, each containing only the rows with a Score above 90:
    • filtered_cp_above_90.csv
    • filtered_kd_above_90.csv
    • filtered_oe_above_90.csv
    • filtered_cc_above_90.csv
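
The filtering described above can be sketched with pandas as follows. This is a minimal illustration, not the repository's actual script: the function names split_by_type and write_filtered are hypothetical, and the sample rows are taken from the export.csv example shown later in this README.

```python
import io
from pathlib import Path

import pandas as pd

TYPES = ("cp", "kd", "oe", "cc")   # Type values listed in Step 1
SCORE_THRESHOLD = 90               # rows must score strictly above this


def split_by_type(df, types=TYPES, threshold=SCORE_THRESHOLD):
    """Return {type: DataFrame of rows with that Type and Score > threshold}."""
    high = df[df["Score"] > threshold]
    return {t: high[high["Type"] == t] for t in types}


def write_filtered(df, out_dir="."):
    """Write one filtered_<type>_above_90.csv file per Type (hypothetical helper)."""
    for t, subset in split_by_type(df).items():
        subset.to_csv(Path(out_dir) / f"filtered_{t}_above_90.csv", index=False)


# Small in-memory stand-in for export.csv, using rows from the example below.
sample = io.StringIO(
    "Rank,Score,Type,ID,Name,Description\n"
    "1,99.98,oe,ccsbBroad304_01966,RUVBL1,ATPases / AAA-type\n"
    "2,99.98,kd,CGS001-8848,TSC22D1,-\n"
    "5,99.88,cp,BRD-A02333338,cyclopamine,Smoothened receptor antagonist\n"
)
groups = split_by_type(pd.read_csv(sample))
print({t: len(g) for t, g in groups.items()})
# {'cp': 1, 'kd': 1, 'oe': 1, 'cc': 0}
```

To process the real file, one would call write_filtered(pd.read_csv("export.csv")) instead of using the in-memory sample.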

Step 2: Fetching CID and SMILES for SwissADME

This step integrates with the PubChem API to enrich the filtered drug candidates with chemical identifiers and structural information, crucial for ADME analysis.

  • Input: filtered_cp_above_90.csv (or any of the filtered files from Step 1).
  • Process:
    • Reads the filtered data and extracts drug names.
    • Utilizes the PubChem PUG REST API to fetch the Compound ID (CID) for each drug name.
    • Uses the obtained CID to retrieve the SMILES (Simplified Molecular-Input Line-Entry System) notation, a standard chemical structure representation.
    • Includes a time.sleep delay between API requests to prevent rate limiting.
  • Output:
    • compounds.csv: A CSV file listing the Compound Name, CID, and SMILES Notation for compounds successfully processed.
    • molecules_for_adme.txt: A text file containing SMILES notations, formatted specifically for direct input into SwissADME (SMILES followed by compound name on each line).
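
A rough sketch of the lookup and output formatting described above is given below. Several details are assumptions rather than facts about the repository's code: the helper names are hypothetical, the 0.5 second delay is an illustrative value, and the CanonicalSMILES property is a guess at what the script requests. Because PubChem may return the SMILES under a differently named key, the parser accepts any key ending in "SMILES".

```python
import csv
import time
from urllib.parse import quote

import requests

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_url(name):
    """PUG REST URL that resolves a compound name to CIDs."""
    return f"{PUG}/compound/name/{quote(name, safe='')}/cids/JSON"


def smiles_url(cid):
    """PUG REST URL that returns a SMILES property for a CID (property is assumed)."""
    return f"{PUG}/compound/cid/{cid}/property/CanonicalSMILES/JSON"


def parse_cid(payload):
    """Take the first CID from a name lookup response, or None if absent."""
    cids = payload.get("IdentifierList", {}).get("CID", [])
    return cids[0] if cids else None


def parse_smiles(payload):
    """Take the first SMILES-like value from a property response, or None."""
    props = payload.get("PropertyTable", {}).get("Properties", [])
    if not props:
        return None
    return next((v for k, v in props[0].items() if k.endswith("SMILES")), None)


def fetch_cid_and_smiles(name, delay=0.5, timeout=30):
    """Return (cid, smiles) for a drug name, or None if either lookup fails."""
    resp = requests.get(cid_url(name), timeout=timeout)
    time.sleep(delay)  # pause after each request to stay under rate limits
    cid = parse_cid(resp.json()) if resp.status_code == 200 else None
    if cid is None:
        return None
    resp = requests.get(smiles_url(cid), timeout=timeout)
    time.sleep(delay)
    smiles = parse_smiles(resp.json()) if resp.status_code == 200 else None
    return (cid, smiles) if smiles else None


def swissadme_line(smiles, name):
    """One SwissADME input line: SMILES, a space, then the compound name."""
    return f"{smiles} {name}"


def write_outputs(records, csv_path="compounds.csv", txt_path="molecules_for_adme.txt"):
    """Write (name, cid, smiles) records to the two Step 2 output files."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Compound Name", "CID", "SMILES Notation"])
        writer.writerows(records)
    with open(txt_path, "w", encoding="utf-8") as f:
        for name, _cid, smiles in records:
            f.write(swissadme_line(smiles, name) + "\n")
```

Keeping URL construction and response parsing in separate functions means they can be checked without network access, while fetch_cid_and_smiles is the only part that actually calls PubChem.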

Future Enhancements

The upcoming phases of this project will include:

  • Pharmacokinetic Evaluation: Detailed ADME screening using SwissADME results (or similar tools) to assess absorption, distribution, metabolism, and excretion profiles.
  • Toxicity Assessment: Prediction of toxicities utilizing the Tox21 dataset and relevant models.
  • Clustering Analysis: Application of unsupervised learning methods such as KMeans and Hierarchical Clustering to identify significant drug groups and patterns within the repurposing results.
  • Quantitative Structure-Activity Relationship (QSAR) Modeling: Implementation of QSAR models using various machine learning techniques including:
    • Random Forest
    • Logistic Regression
    • Support Vector Machine (SVM)
    • Gradient Boosting
    • Comparison of results with Deep Neural Network (DNN) approaches.
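
Since the QSAR phase is not implemented yet, the following is only a speculative sketch of how the listed classifiers might be compared with scikit-learn. Everything in it is an assumption: the helper names, the hyperparameters, the ROC AUC metric, and above all the data, which is synthetic output from make_classification standing in for real molecular descriptors and activity labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def candidate_models(seed=0):
    """The four classical model families named in the list above."""
    return {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # Linear and kernel models are scaled first; tree ensembles do not need it.
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }


def compare_models(X, y, cv=5):
    """Return the mean cross-validated ROC AUC for each candidate model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in candidate_models().items()
    }


# Synthetic placeholder data: 200 "molecules" with 20 descriptor columns each.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
for name, auc in compare_models(X, y).items():
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

A DNN baseline could later be added to the same dictionary-based comparison, provided it exposes a scikit-learn compatible estimator interface.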

Stay tuned for these comprehensive updates!

Technology Stack

  • Python
  • Jupyter Notebook
  • Pandas: For data manipulation and CSV file processing.
  • Requests: For making HTTP requests to external APIs (e.g., PubChem).
  • csv (Python standard library): For writing CSV files.
  • time (Python standard library): For pausing between API requests.
  • (Future) Scikit-learn: For various machine learning algorithms (QSAR, clustering, etc.).
  • (Future) Matplotlib & Seaborn: For data visualization.
  • (Future) TensorFlow/Keras or PyTorch: For Deep Neural Network implementations.

Example Outputs

Below are examples of the input file and of the intermediate and final files generated by the current script:

export.csv (Input Example):

Rank,Score,Type,ID,Name,Description
1,99.98,oe,ccsbBroad304_01966,RUVBL1,ATPases / AAA-type
2,99.98,kd,CGS001-8848,TSC22D1,-
3,99.96,oe,ccsbBroad304_00841,IKBKB,IKK family
4,99.94,kd,CGS001-1196,CLK2,CDC-like kinases
5,99.88,cp,BRD-A02333338,cyclopamine,Smoothened receptor antagonist
...

filtered_cp_above_90.csv (Example Output from Step 1):

Rank,Score,Type,ID,Name,Description
5,99.88,cp,BRD-A02333338,cyclopamine,Smoothened receptor antagonist
20,99.25,cp,BRD-K90543092,levonorgestrel,Estrogen receptor agonist
21,99.21,cp,BRD-K59456551,methotrexate,Dihydrofolate reductase inhibitor
...

compounds.csv (Example Output from Step 2):

Compound Name,CID,SMILES Notation
cyclopamine,442972,CC1CC2C(C(C3(O2)CCC4C5CC=C6CC(CCC6(C5CC4=C3C)C)O)C)NC1
levonorgestrel,13109,CCC12CCC3C(C1CCC2(C#C)O)CCC4=CC(=O)CCC34
methotrexate,126941,CN(CC1=CN=C2C(=N1)C(=NC(=N2)N)N)C3=CC=C(C=C3)C(=O)NC(CCC(=O)O)C(=O)O
...

molecules_for_adme.txt (Example Output from Step 2):

CC1CC2C(C(C3(O2)CCC4C5CC=C6CC(CCC6(C5CC4=C3C)C)O)C)NC1 cyclopamine
CCC12CCC3C(C1CCC2(C#C)O)CCC4=CC(=O)CCC34 levonorgestrel
CN(CC1=CN=C2C(=N1)C(=NC(=N2)N)N)C3=CC=C(C=C3)C(=O)NC(CCC(=O)O)C(=O)O methotrexate
...
