iKraph: a comprehensive, large-scale biomedical knowledge graph for AI-powered, data-driven biomedical research
This repository contains the code for the paper "iKraph: a comprehensive, large-scale biomedical knowledge graph for AI-powered, data-driven biomedical research". It has three parts:
- Named Entity Recognition: code for training and inference on biomedical papers to extract named entities, such as genes, drugs, chemicals and diseases.
- Relation Extraction: code for training and inference on extracting relations between the named entities.
- Repurposing: code for drug repurposing.
NER and RE have different Python environment requirements. See the README.md file under each directory for instructions.
The repurposing data must be downloaded and placed under repurposing/data, as described below.
Data for this project is available at this DOI link. The contents are:
- data.tar.gz: repurposing data
- iKraph_full.tar.gz: complete version of iKraph data
An overview of the data is provided in the README file inside the tarball. Extract data.tar.gz and put the extracted data under repurposing/data.
A comprehensive iKraph_README.md file explaining the structure and usage of the dataset is included in the downloaded tarball. Review it for guidance on integrating and using the data.
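The extraction step can be sketched in Python as follows. This is a minimal illustration, not a script shipped with this repository: the helper name `extract_archive` is hypothetical, and it assumes data.tar.gz has already been downloaded into the current working directory. The internal layout of the archive is not described here, so check the extracted files against the README in the tarball.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tarball into dest_dir and return its member names.

    Hypothetical helper for illustration; not part of the iKraph code.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where the running Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names


if __name__ == "__main__":
    # Assumption: the archive sits in the current directory after download.
    archive = Path("data.tar.gz")
    if archive.exists():
        for name in extract_archive(archive, "repurposing/data"):
            print(name)
    else:
        print("data.tar.gz not found; download it first.")
```

Printing the member names makes it easy to confirm where the files landed, for example whether the archive unpacks into its own top-level directory.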
Our pipeline is trained using BioRED data.
iExplore is the query and visualization product of iKraph, the biomedical knowledge graph developed at Insilicom.