CSE 6250 Big Data for Healthcare Term Project

ICU patient diagnosis prediction

Authors

name	email	gtid
Jingyang Sui	[email protected]	jsui7
Qianzhen Li	[email protected]	qli365
Chenyu Shi	[email protected]	cshi74

Modification Based on DoctorAI

Use the CCS label mapping to compress the label space
Modify the CCS label mapping so that some of the mapped integer labels are smaller than in the original mapping
Modify process_mimic.py to split the input dataset into three subsets (train, test, valid)
Generate a time file so the model can predict the time of a patient's next visit
Generate mortality labels in process_mimic_mortality.py for mortality prediction
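
As a rough illustration of the label compression idea, the sketch below maps ICD-9 codes to CCS categories and then to dense integer labels. It is not the code in process_mimic.py: the ICD9_TO_CCS table, the compress_labels helper and the sample visits are hypothetical, and the real mapping comes from the label map files in this repo.

```python
# Hypothetical ICD-9 -> CCS category table, for illustration only.
# The real mapping is read from the repository's label map files.
ICD9_TO_CCS = {
    "4280": 108,
    "42731": 106,
    "5849": 157,
    "25000": 49,
}


def compress_labels(visits, icd_to_ccs):
    """Map each visit's ICD-9 codes to dense integer labels via CCS categories."""
    ccs_to_label = {}
    compressed = []
    for visit in visits:
        labels = set()
        for code in visit:
            ccs = icd_to_ccs.get(code)
            if ccs is None:
                continue  # assumption: codes without a CCS category are skipped
            # Assign the next free integer the first time a category is seen.
            labels.add(ccs_to_label.setdefault(ccs, len(ccs_to_label)))
        compressed.append(sorted(labels))
    return compressed, ccs_to_label


# Two made-up visits; "99999" has no entry in the table above.
visits = [["4280", "42731"], ["4280", "5849", "99999"]]
compressed, ccs_to_label = compress_labels(visits, ICD9_TO_CCS)
print(compressed)          # labels per visit
print(len(ccs_to_label))   # size of the compressed label space
```

Many ICD-9 codes share a CCS category, so the label space after mapping is much smaller than the raw code space.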

Install And Usage

We use Anaconda to set up a Python virtual environment. To run the model, follow the steps below.

Create the virtual environment. From the root directory of this repo, run

conda env create -f env.yml

Activate the virtual environment. From the root directory of this repo, run

activate theano-py2-env

To generate the training/testing files for visit label prediction and visit time prediction, put ADMISSION.csv and DIAGNOSIS_ICD.csv from MIMIC-III in the root directory of this repo, then run

python process_mimic.py ADMISSION.csv DIAGNOSIS_ICD.csv <your output file name prefix>

This generates all the files (visits, labels and time) in three subsets, with .train, .test and .valid suffixes, in the format that training expects. It also prints the number of distinct labels in the visit files and in the label files; pass these two numbers to doctorAI.py as arguments when training. Similarly, to generate mortality labels, run

python process_mimic_mortality.py ADMISSION.csv DIAGNOSIS_ICD.csv <your output file name prefix>
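
The three-way split into .train, .valid and .test files can be pictured with the minimal sketch below. It is not the code in process_mimic.py: the split_and_save helper, the 70/10/20 percentages, the shuffling and the seed are all assumptions, and the files are assumed to be pickled Python lists as in the upstream DoctorAI format.

```python
import os
import pickle
import random
import tempfile


def split_and_save(sequences, prefix, percents=(70, 10, 20), seed=0):
    """Shuffle patients and write <prefix>.train/.valid/.test pickle files.

    The percentages and seed are assumptions, not the values used by
    process_mimic.py.
    """
    indices = list(range(len(sequences)))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(indices) * percents[0] // 100
    n_valid = len(indices) * percents[1] // 100
    parts = {
        "train": indices[:n_train],
        "valid": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],
    }
    for suffix, idx in parts.items():
        with open(prefix + "." + suffix, "wb") as f:
            # Protocol 2 keeps the files readable from Python 2.
            pickle.dump([sequences[i] for i in idx], f, protocol=2)
    return {suffix: len(idx) for suffix, idx in parts.items()}


# Ten made-up patients, each a list of visits, each visit a list of codes.
out_dir = tempfile.mkdtemp()
prefix = os.path.join(out_dir, "demo.visits")
patients = [[[i, i + 1], [i + 2]] for i in range(10)]
sizes = split_and_save(patients, prefix)
print(sizes)
```

Using the same seed for the visit, label and time files keeps the three kinds of file aligned patient by patient, which is why the sketch takes the seed as a parameter.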

To train the model, run

python doctorAI.py <your visit file w/o suffix> <#labels in visit file> <your label file w/o suffix> <#labels in label file> <your model file name> {other optional arguments}

Here, the visit file w/o suffix is the file name without the trailing .train, .test or .valid. For example, if process_mimic.py generates visit files named testdata.visits.train, testdata.visits.test and testdata.visits.valid, pass testdata.visits as the argument. The same rule applies to the label and time files. To also predict visit time, add the following to {other optional arguments} in the command above:

--predict_time 1 --time_file <your time file w/o suffix>
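
Because doctorAI.py takes prefixes rather than full file names, it can help to check that all three suffixed files exist before starting a training run. The sketch below is a hypothetical helper, not part of this repo; expand_prefix and missing_files are made-up names, and the demo creates empty placeholder files in a temporary directory.

```python
import os
import tempfile

SUFFIXES = (".train", ".valid", ".test")


def expand_prefix(prefix):
    """Return the three split file paths that share the given prefix."""
    return [prefix + suffix for suffix in SUFFIXES]


def missing_files(prefixes):
    """List every expected split file that does not exist on disk."""
    return [path
            for prefix in prefixes
            for path in expand_prefix(prefix)
            if not os.path.isfile(path)]


# Demo: create all three visit files but only the first two label files.
work_dir = tempfile.mkdtemp()
visit_prefix = os.path.join(work_dir, "testdata.visits")
label_prefix = os.path.join(work_dir, "testdata.labels")
for path in expand_prefix(visit_prefix) + expand_prefix(label_prefix)[:2]:
    open(path, "w").close()

missing = missing_files([visit_prefix, label_prefix])
print([os.path.basename(p) for p in missing])
```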

For other training arguments, run python doctorAI.py --help. For details on the format of the training files, refer to the DoctorAI README.

To compute the performance metrics of a trained model, run

python testDoctorAI.py <your model file name w/ .npz suffix> <your visit file w/ suffix> <your label file w/ suffix> <model RNN hidden dimension>  {other optional arguments}

The argument <model RNN hidden dimension> is the RNN hidden dimension the model was trained with, which is set through one of the optional arguments to doctorAI.py. Run python doctorAI.py --help to see the default value (a list of numbers enclosed in square brackets).
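
To make the bracketed format concrete, the sketch below parses a value such as [2000] into a list of integers. The parse_hidden_dims function is hypothetical and only illustrates the format; the parsing inside doctorAI.py and testDoctorAI.py may differ.

```python
def parse_hidden_dims(text):
    """Parse a bracketed, comma-separated string such as "[200,200]" into ints."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a value like [2000], got %r" % text)
    # Drop the brackets, split on commas, and ignore empty pieces.
    return [int(part) for part in text[1:-1].split(",") if part.strip()]


print(parse_hidden_dims("[2000]"))
print(parse_hidden_dims("[200, 200]"))
```

Some shells treat square brackets as glob patterns, so quoting the value on the command line, for example '[2000]', is the safer choice.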

For TA Testing

We uploaded our model file model.15.npz and the corresponding label/visit/time files. To use them, run

python testDoctorAI.py model.15.npz test.visits.test test.labels.test [2000] --predict_time 1 --time_file test.time.test


jysui123/cse6250project
