A Python library to convert MEDS datasets into RDF using the MEDS Ontology

MEDS2RDF

Latest Release | Tests | Python 3.12 | License | DOI

Convert MEDS datasets into RDF using the MEDS Ontology

MEDS (Medical Event Data Standard) is a standard schema for representing longitudinal medical event data. This library, meds2rdf, converts MEDS-compliant datasets into RDF triples using the MEDS Ontology.

Features

  • Convert MEDS datasets (Data, Codes, Labels, Subject Splits) into RDF.
  • Supports all MEDS value modalities: numeric, text, images, waveforms.
  • Fully links:
    • Events to Subjects
    • Codes to metadata
    • Labels to prediction samples
    • Subjects to splits
    • Events and Codes to dataset metadata
  • Outputs RDF in Turtle format (.ttl) ready for use with standard RDF tools.

Installation

Clone the repository, then install it in editable mode from the repo root:

git clone https://github.com/TeamHeKA/meds2rdf.git
cd meds2rdf
pip install -e .

Alternatively, install it directly from GitHub without cloning:

pip install git+https://github.com/TeamHeKA/meds2rdf.git

How to Use

from meds2rdf import MedsRDFConverter

# Initialize the converter with the path to your MEDS dataset directory
converter = MedsRDFConverter("/path/to/your/meds_dataset")

# Convert the dataset into an RDF graph
graph = converter.convert(
    include_dataset_metadata=True,
    include_codes=True,
    include_labels=True,
    include_splits=True,
    generate_code_nodes=False,
    shacl_path=None
)

# Serialize the graph to different formats
graph.serialize(destination="output_dataset.ttl", format="turtle")
graph.serialize(destination="output_dataset.xml", format="xml")
graph.serialize(destination="output_dataset.nt", format="nt")

print("Conversion complete! RDF files saved.")
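
As a quick sanity check on the serialized output, the N-Triples file can be inspected with the standard library alone, because N-Triples stores one triple per line. The sketch below is not part of meds2rdf: `count_predicates` is a hypothetical helper, and it assumes the `output_dataset.nt` file written by the example above. It is a rough line-based count, not a validating parser.

```python
from collections import Counter
from pathlib import Path


def count_predicates(nt_path):
    """Count how often each predicate appears in an N-Triples file."""
    counts = Counter()
    with Path(nt_path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            # Subjects and predicates never contain whitespace in N-Triples,
            # so the first two tokens can be split off; the object (which may
            # be a literal containing spaces) stays in the remainder.
            parts = line.split(None, 2)
            if len(parts) == 3:
                counts[parts[1]] += 1
    return counts


if __name__ == "__main__":
    nt_file = Path("output_dataset.nt")  # file name from the example above
    if nt_file.exists():
        for predicate, n in count_predicates(nt_file).most_common(10):
            print(n, predicate)
```

A predicate that you expected but that has a count of zero is a hint to revisit the `include_*` flags passed to `convert`.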

Notes

  • Make sure your MEDS dataset directory contains the expected structure:

    • metadata/dataset.json
    • metadata/codes.parquet (optional)
    • metadata/subject_splits.parquet (optional)
    • data/ folder with Parquet files
    • labels/ folder with label Parquet files
  • The convert method returns an rdflib.Graph object that you can further manipulate or serialize.
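
The layout listed above can be checked before running a conversion. The following is a minimal sketch using only `pathlib`; `check_meds_layout` is a hypothetical helper, not part of meds2rdf. It assumes that Parquet files may sit in subfolders of `data/` and `labels/`, and it treats `labels/` as expected because the list above does not mark it optional; whether the converter actually requires it is an assumption.

```python
from pathlib import Path

# Paths taken from the notes above; entries marked optional there are
# reported separately instead of being treated as missing.
REQUIRED_FILES = ["metadata/dataset.json"]
OPTIONAL_FILES = ["metadata/codes.parquet", "metadata/subject_splits.parquet"]
PARQUET_DIRS = ["data", "labels"]


def check_meds_layout(root):
    """Report which expected and optional entries are absent under `root`."""
    root = Path(root)
    missing = [p for p in REQUIRED_FILES if not (root / p).is_file()]
    for name in PARQUET_DIRS:
        folder = root / name
        # Search recursively, since Parquet shards may sit in subfolders.
        if not folder.is_dir() or not any(folder.rglob("*.parquet")):
            missing.append(f"{name}/ (no Parquet files found)")
    optional_absent = [p for p in OPTIONAL_FILES if not (root / p).is_file()]
    return {"missing": missing, "optional_absent": optional_absent}


if __name__ == "__main__":
    report = check_meds_layout("/path/to/your/meds_dataset")
    print("Missing:", report["missing"])
    print("Optional, not present:", report["optional_absent"])
```

An empty `missing` list only says the files exist; it does not verify that their contents follow the MEDS schema.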

Running Tests

This project uses pytest.

Install development dependencies

From the repository root:

python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

pip install -e ".[dev]"   # quotes keep shells such as zsh from expanding the brackets

If you don’t have optional dev dependencies set up, install pytest manually:

pip install pytest

Installing in editable mode (-e) is important so Python can import the meds2rdf package during tests.
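
If tests fail with an import error, it can help to confirm that the active environment can locate the package at all. This is a small sketch using the standard library's `importlib.util.find_spec`; `is_importable` is a hypothetical helper name, not something the project provides.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate a top-level package by name."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_importable("meds2rdf"):
        print("meds2rdf is importable")
    else:
        print("meds2rdf not found; try `pip install -e .` from the repository root")
```

Run it with the same interpreter that runs pytest (for example, inside the activated `.venv`), otherwise the result may not reflect the test environment.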

Run the full test suite

From the repository root:

pytest

Cite this Repository

If you use meds2rdf in your research, please cite it as follows:

BibTeX

@software{meds2rdf,
  title        = {meds2rdf: Converting MEDS Datasets to RDF Using the MEDS Ontology},
  author       = {{Alberto Marfoglia and Contributors}},
  year         = {2025},
  url          = {https://github.com/TeamHeKA/meds2rdf},
  note         = {Python library for converting MEDS-compliant datasets into RDF}
}
