Convert MEDS datasets into RDF using the MEDS Ontology
MEDS (Medical Event Data Standard) is a standard schema for representing longitudinal medical event data. This library, meds2rdf, converts MEDS-compliant datasets into RDF triples using the MEDS Ontology.
- Convert MEDS datasets (Data, Codes, Labels, Subject Splits) into RDF.
- Supports all MEDS value modalities: numeric, text, images, waveforms.
- Fully links:
  - Events to Subjects
  - Codes to metadata
  - Labels to prediction samples
  - Subjects to splits
  - Events and Codes to dataset metadata
- Outputs RDF in Turtle format (.ttl) ready for use with standard RDF tools.
To install from source, clone the repository and install it in editable mode:
git clone https://github.com/TeamHeKA/meds2rdf.git
cd meds2rdf
pip install -e .
You can also install it directly from GitHub:
pip install git+https://github.com/TeamHeKA/meds2rdf.git

from meds2rdf import MedsRDFConverter
# Initialize the converter with the path to your MEDS dataset directory
converter = MedsRDFConverter("/path/to/your/meds_dataset")
# Convert the dataset into an RDF graph
graph = converter.convert(
    include_dataset_metadata=True,
    include_codes=True,
    include_labels=True,
    include_splits=True,
    generate_code_nodes=False,
    shacl_path=None,
)
# Serialize the graph to different formats
graph.serialize(destination="output_dataset.ttl", format="turtle")
graph.serialize(destination="output_dataset.xml", format="xml")
graph.serialize(destination="output_dataset.nt", format="nt")
print("Conversion complete! RDF files saved.")

- Make sure your MEDS dataset directory contains the expected structure:
  - metadata/dataset.json
  - metadata/codes.parquet (optional)
  - metadata/subject_splits.parquet (optional)
  - data/ folder with Parquet files
  - labels/ folder with label Parquet files
- The convert method returns an rdflib.Graph object that you can further manipulate or serialize.
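Before running a conversion, it can help to check that the dataset directory matches the layout listed above. The following is a minimal sketch using only the standard library. The helper name `check_meds_layout` is hypothetical and not part of meds2rdf, and treating `data/` and `labels/` as expected (because they are not marked optional above) is an assumption; the converter itself may be more lenient.

```python
from pathlib import Path

# Paths taken from the expected structure listed above.
# Assumption: entries without "(optional)" are treated as expected.
EXPECTED = ["metadata/dataset.json", "data", "labels"]
OPTIONAL = ["metadata/codes.parquet", "metadata/subject_splits.parquet"]


def check_meds_layout(root):
    """Return (missing_expected, missing_optional) for a MEDS dataset directory.

    Hypothetical helper for illustration; it only checks that paths exist
    and does not validate the contents of any file.
    """
    root = Path(root)
    missing_expected = [p for p in EXPECTED if not (root / p).exists()]
    missing_optional = [p for p in OPTIONAL if not (root / p).exists()]
    return missing_expected, missing_optional


if __name__ == "__main__":
    missing, optional = check_meds_layout("/path/to/your/meds_dataset")
    if missing:
        print("Missing expected entries:", ", ".join(missing))
    if optional:
        print("Missing optional entries:", ", ".join(optional))
```

Running a check like this first makes a missing file show up as a clear message rather than an error partway through the conversion.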
This project uses pytest.
From the repository root:
python -m venv .venv
source .venv/bin/activate # Linux/macOS
# .venv\Scripts\activate # Windows
pip install -e ".[dev]"
If you don’t have optional dev dependencies set up, install pytest manually:
pip install pytest
Installing in editable mode (-e) is important so Python can import the meds2rdf package during tests.
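If tests fail with an import error, one quick way to confirm whether the package is visible to the active environment is to ask Python's import system directly. This is a small sketch using the standard library; the helper name `is_importable` is hypothetical and only handles top-level package names.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a top-level package can be found on the current import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_importable("meds2rdf"):
        print("meds2rdf is importable in this environment.")
    else:
        print("meds2rdf was not found; try `pip install -e .` from the repository root.")
```

Run it with the same interpreter (and virtual environment) that you use for pytest, since each environment has its own set of installed packages.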
Then run the test suite from the repository root:
pytest

If you use meds2rdf in your research, please cite it as follows:
@software{meds2rdf,
  title = {meds2rdf: Converting MEDS Datasets to RDF Using the MEDS Ontology},
  author = {{Alberto Marfoglia and Contributors}},
  year = {2025},
  url = {https://github.com/TeamHeKA/meds2rdf},
  note = {Python library for converting MEDS-compliant datasets into RDF}
}