Synthea™ is a Synthetic Patient Population Simulator. The goal is to output synthetic, realistic (but not real) patient data and associated health records in a variety of formats.
Read our wiki for more information.
Currently, Synthea™ features include:
- Birth to Death Lifecycle
- Configuration-based statistics and demographics (defaults with Massachusetts Census data)
- Modular Rule System
- Drop in Generic Modules
- Custom Java rules modules for additional capabilities
- Primary Care Encounters, Emergency Room Encounters, and Symptom-Driven Encounters
- Conditions, Allergies, Medications, Vaccinations, Observations/Vitals, Labs, Procedures, CarePlans
- Formats
  - HL7 FHIR (R4, STU3 v3.0.1, and DSTU2 v1.0.2)
  - Bulk FHIR in ndjson format (set exporter.fhir.bulk_data = true to activate)
  - C-CDA (set exporter.ccda.export = true to activate)
  - CSV (set exporter.csv.export = true to activate)
  - CPCDS (set exporter.cpcds.export = true to activate)
- Rendering Rules and Disease Modules with Graphviz
These instructions are intended for those wishing to examine the Synthea source code, extend it or build the code locally. Those just wishing to run Synthea should follow the Basic Setup and Running instructions instead.
System Requirements: Synthea™ requires Java 11 or newer.
To clone the Synthea™ repo, then build and run the test suite:
git clone https://github.com/synthetichealth/synthea.git
cd synthea
./gradlew build check test
The default properties file values can be found at src/main/resources/synthea.properties.
By default, Synthea does not generate C-CDA, CPCDS, CSV, or Bulk FHIR (ndjson). You'll need to adjust this file to activate these features. See the wiki for more details.
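As a rough illustration of checking which of these exporters are active, the sketch below parses properties-style text and reports the four exporter settings named in this README. The helper names parse_properties and enabled_exporters are hypothetical, and the parser handles only simple key = value lines, not the full Java properties syntax.

```python
from typing import Dict, List

# Exporter settings named in this README; all other keys are ignored here.
EXPORTER_KEYS = [
    "exporter.fhir.bulk_data",
    "exporter.ccda.export",
    "exporter.csv.export",
    "exporter.cpcds.export",
]


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines, skipping blanks and '#' or '!' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def enabled_exporters(props: Dict[str, str]) -> List[str]:
    """Return the keys from EXPORTER_KEYS whose value is 'true' (case-insensitive)."""
    return [k for k in EXPORTER_KEYS if props.get(k, "false").lower() == "true"]


# Hand-written sample text; a real run would read the properties file instead.
sample = "exporter.csv.export = true\nexporter.ccda.export = false\n"
print(enabled_exporters(parse_properties(sample)))
```

To inspect a real checkout, read the text from src/main/resources/synthea.properties and pass it to parse_properties.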
The following command generates an identical population each time it is run. You can also use it to update a population after new labs or results are added: it completely rebuilds the population of patients, but the result is identical to the previous one, plus any new data that has been added. You don't need to pass any parameters to run this generator, but you may add them as you like. See the sh file for specifics on how the population is seeded to be the same.
./generate_identical_synthea_pop
To generate a population of a specific size:
./generate_identical_synthea_pop -p [Integer number of patients]
Using -p does not change the population, only the number of patients returned. The patients are identical to those generated before and are created in the same order.
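This is the behavior one would expect from a generator driven by a fixed seed: asking for fewer patients yields the beginning of the same sequence. The sketch below illustrates the idea with Python's random module. It is an analogy only, not Synthea's implementation, and the function name generate_population and the seed value 42 are hypothetical.

```python
import random
from typing import List


def generate_population(size: int, seed: int = 42) -> List[int]:
    """Return `size` pseudo-random patient ids from a freshly seeded generator."""
    rng = random.Random(seed)  # new generator per call, so runs do not affect each other
    return [rng.randrange(10**9) for _ in range(size)]


small = generate_population(5)
large = generate_population(10)

# Same seed: the smaller population is the first part of the larger one.
print(small == large[:5])  # True
# Repeating a run reproduces it exactly.
print(generate_population(5) == small)  # True
```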
Generating many populations of 10k patients at a time, using loop_synthea.sh
./loop_synthea 10
Here, 10 is the number of times to generate a population of 10k patients. The script uses append mode and moves the files to the csv/partitions folder when complete.
Generating the population one at a time...
./run_synthea
Command-line arguments may be provided to specify a state, city, population size, or seed for randomization.
run_synthea [-s seed] [-p populationSize] [state [city]]
Full usage info can be printed by passing the -h option.
$ ./run_synthea -h
> Task :run
Usage: run_synthea [options] [state [city]]
Options: [-s seed]
[-cs clinicianSeed]
[-p populationSize]
[-r referenceDate as YYYYMMDD]
[-g gender]
[-a minAge-maxAge]
[-o overflowPopulation]
[-c localConfigFilePath]
[-d localModulesDirPath]
[-i initialPopulationSnapshotPath]
[-u updatedPopulationSnapshotPath]
[-t updateTimePeriodInDays]
[-f fixedRecordPath]
[-k keepMatchingPatientsPath]
[--config*=value]
* any setting from src/main/resources/synthea.properties
Examples:
run_synthea Massachusetts
run_synthea Alaska Juneau
run_synthea -s 12345
run_synthea -p 1000
run_synthea -s 987 Washington Seattle
run_synthea -s 21 -p 100 Utah "Salt Lake City"
run_synthea -g M -a 60-65
run_synthea -p 10 --exporter.fhir.export=true
run_synthea --exporter.baseDirectory="./output_tx/" Texas
run_synthea -p 10000 --exporter.years_of_history=0 -m tgh_hcm_study_1
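For scripted runs, the command line can be assembled programmatically. The sketch below builds an argument list using only options shown in the usage text above (-s, -p, and --key=value overrides). The function name build_command is a hypothetical helper, and the script is assumed to be invoked as ./run_synthea from the repository root.

```python
from typing import Dict, List, Optional


def build_command(
    state: Optional[str] = None,
    city: Optional[str] = None,
    seed: Optional[int] = None,
    population: Optional[int] = None,
    overrides: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble a run_synthea argument list: options first, then state and city."""
    if city is not None and state is None:
        raise ValueError("a city requires a state")
    cmd = ["./run_synthea"]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if population is not None:
        cmd += ["-p", str(population)]
    # Each override becomes one --key=value argument.
    for key, value in (overrides or {}).items():
        cmd.append(f"--{key}={value}")
    if state is not None:
        cmd.append(state)
    if city is not None:
        cmd.append(city)
    return cmd


print(build_command("Utah", "Salt Lake City", seed=21, population=100))
print(build_command(population=10, overrides={"exporter.fhir.export": "true"}))
```

Because the result is a list rather than a single string, a multi-word city such as Salt Lake City needs no shell quoting when the list is passed to subprocess.run.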
Some settings can be changed in ./src/main/resources/synthea.properties.
Synthea™ will output patient records in C-CDA and FHIR formats in ./output.
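To get a quick overview of a generated FHIR record, one can count its resources by type. The sketch below relies only on the standard FHIR Bundle layout (an entry list whose items each hold a resource with a resourceType). The helper name count_resource_types is hypothetical, and a small hand-written bundle stands in for a real file, since the exact file names under ./output depend on the run.

```python
import json
from collections import Counter
from typing import Any, Dict


def count_resource_types(bundle: Dict[str, Any]) -> Counter:
    """Count resources by resourceType in a parsed FHIR Bundle."""
    counts: Counter = Counter()
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        counts[resource.get("resourceType", "Unknown")] += 1
    return counts


# A tiny hand-written bundle standing in for a generated record.
example = json.loads("""
{"resourceType": "Bundle", "entry": [
  {"resource": {"resourceType": "Patient"}},
  {"resource": {"resourceType": "Encounter"}},
  {"resource": {"resourceType": "Encounter"}}
]}
""")
print(count_resource_types(example))
```

For a real record, parse the file with json.load and pass the resulting dictionary to the same function.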
Generate graphical visualizations of Synthea™ rules and modules.
./gradlew graphviz
Generate a list of concepts (used in the records) or attributes (variables on each patient).
./gradlew concepts
./gradlew attributes
Copyright 2017-2022 The MITRE Corporation
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.