Gene Ontology (GO)-based autoencoder for embedding single-cell RNA-seq data.
Generally, the OntoEncoder takes three inputs: X, y, and a topology (the Gene Ontology, or any directed acyclic graph you supply).
Please refer to notebooks/TopoNet* for examples of supervised learning and to notebooks/OntoEncoder* for unsupervised learning.
Any single-cell RNA-seq dataset can be log-normalized and saved as an .h5ad file with the scanpy package.
The processing steps are recorded in notebooks/GSE71585-single cell.ipynb.
The processed data are stored in /cellar/users/hsher/ontoencoder/notebooks/tasic.h5ad (accessible to the Ideker lab).
The topology should be stored in the DCell format. See here for an example.
This topology file can be converted to an OntoEncoder/TopoNet-compatible format using ontoencoder.topology.topo_reader().
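To give a sense of what such a topology file contains, the sketch below parses a DCell-style file using only the standard library. It assumes the three-column, tab-separated layout (parent term, child term or gene, relation type, where `gene` marks a term-to-gene annotation and anything else is a term-to-term edge). The function `read_dcell_like` and the term and gene names are hypothetical; this does not reproduce what `topo_reader()` does internally.

```python
import csv
import io
from collections import defaultdict


def read_dcell_like(handle):
    """Split a DCell-style edge list into term-term and term-gene maps."""
    term_children = defaultdict(set)  # parent term -> child terms
    term_genes = defaultdict(set)     # term -> directly annotated genes
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        parent, child, relation = row
        if relation == "gene":
            term_genes[parent].add(child)
        else:
            term_children[parent].add(child)
    return dict(term_children), dict(term_genes)


# Hypothetical four-line topology: one root term with two child terms
example = (
    "GO:A\tGO:B\tdefault\n"
    "GO:A\tGO:C\tdefault\n"
    "GO:B\tGENE1\tgene\n"
    "GO:C\tGENE2\tgene\n"
)
children, genes = read_dcell_like(io.StringIO(example))
```

With a real file, `io.StringIO(example)` would be replaced by an open file handle.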
Please refer to OntoPrune [https://github.com/algaebrown/ontoPrune] for more information.
I haven't implemented that.
environment.yml lists the dependencies. Refer to here for how to install the same conda environment (typically: conda env create -f environment.yml).