FHIR BULK DATA API
Extending FHIR to Population Level Datasets
Dan Gottlieb, Central Square Solutions, LLC (@GotDan)
Josh Mandel, Microsoft (@JoshCMandel)
Revised September 2021
slides: https://bit.ly/fhir-bulk-api
Agenda
• Why a FHIR Bulk Data API?
• Technical Architecture & what’s new in 2.0
• Open Source Tools
• Adoption!
• Next steps and how to get involved
FHIR APIs
[Diagram: the FHIR REST API and the FHIR Bulk Data API placed along a scale of data access from Patient to Panel to Population]
Use Cases
• Internal clinical data warehouse for study cohort identification
• Claims data in the EHR to provide a comprehensive view
• Machine learning startup obtaining training data from cloud EHR
• Integrating a population health system with an EHR system
• Transferring records from one EHR to another
• Payer database to assess care quality
• Reportable disease submission or other registry
Let’s enhance FHIR to support population-level data access
• FHIR Resources as a standard data model to simplify data parsing and
mapping
• FHIR Operation API to initiate the data extracts
• SMART Backend Services Authorization as security model
Focused Scope + Complementary Technologies
• Legal framework for sharing data between partners needs to be set
up out-of-band (BAAs, SLAs, DUAs)
• Real-time data - data loaded through bulk APIs can be supplemented
with real time FHIR REST API calls or subscriptions
• Patient matching - it’s possible to include identifiers like subscriber
number in Bulk Export FHIR resources
• Data transformation - can serve as a foundation for data pipeline
Technical Architecture
Bulk Data Access Implementation Guide
Bulk Data Access IG Versions
• STU1 (v1): Initial version. Published August 2019.
• STU2 (v2): Incorporates experience from early implementations. Published November 2021.
Kick-off Request
Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Kick-off Request
• Asynchronous requests with status polling (HTTP GET only in v1, GET or POST in v2) [Updated in v2]
Prefer: respond-async
• FHIR Operation for all data on all patients (all data in the patient “compartment”)
[FHIR Server Base]/Patient/$export
• FHIR Operation for all data on a group of patients (e.g., research cohort, plan members)
[FHIR Server Base]/Group/[group id]/$export
• FHIR Operation for all data on the server
[FHIR Server Base]/$export
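The three kick-off endpoints above can be sketched as a small URL builder. This is a minimal illustration, not part of the original slides: the base URL, the function name `export_url`, and the `level` argument are hypothetical, and the `Accept` header value is an assumption about a typical request.

```python
from urllib.parse import urlencode

FHIR_BASE = "https://example.com/fhir"  # hypothetical server base URL


def export_url(base, level="patient", group_id=None, params=None):
    """Build a $export kick-off URL for the patient, group, or system level."""
    base = base.rstrip("/")
    if level == "patient":
        url = f"{base}/Patient/$export"
    elif level == "group":
        if not group_id:
            raise ValueError("group-level export needs a group id")
        url = f"{base}/Group/{group_id}/$export"
    elif level == "system":
        url = f"{base}/$export"
    else:
        raise ValueError(f"unknown export level: {level}")
    if params:
        # Leave commas unescaped so lists stay readable, e.g. _type=Patient,Observation
        url += "?" + urlencode(params, safe=",")
    return url


# Headers sent with an asynchronous kick-off request (assumed typical values)
KICKOFF_HEADERS = {
    "Accept": "application/fhir+json",
    "Prefer": "respond-async",
}
```

For example, `export_url(FHIR_BASE, "group", "cohort-1")` yields the group-level endpoint for a hypothetical group id `cohort-1`.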
Kick-off Operation Parameters
• _outputFormat: The format for the generated bulk data files. Currently, only ndjson is supported.
• _since: Filter results by FHIR resource modified date, given as a FHIR instant timestamp (required for servers to support). [Updated in v2]
• _type: Filter results by a comma-delimited list of FHIR resource types (optional for servers to support).
• _typeFilter: Filter using FHIR REST queries (optional and experimental).
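As a rough sketch of how these parameters might be encoded for a GET kick-off request, the helper below turns typed inputs into a query string. The function name `kickoff_query` and its arguments are hypothetical, and the example filter `MedicationRequest?status=active` is only an illustration of a FHIR REST query.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def kickoff_query(since=None, types=None, type_filters=None):
    """Encode kick-off parameters as a URL query string."""
    params = {"_outputFormat": "application/fhir+ndjson"}
    if since is not None:
        if since.tzinfo is None:
            raise ValueError("_since needs a timezone-aware datetime")
        # A FHIR instant carries a timezone offset
        params["_since"] = since.isoformat(timespec="milliseconds")
    if types:
        params["_type"] = ",".join(types)
    if type_filters:
        params["_typeFilter"] = ",".join(type_filters)
    return urlencode(params)


query = kickoff_query(
    since=datetime(2020, 7, 13, 13, 28, 17, 239000, tzinfo=timezone.utc),
    types=["Patient", "Observation"],
    type_filters=["MedicationRequest?status=active"],
)
```

Percent-encoding matters here: the `?` and `=` inside a `_typeFilter` value would otherwise be confused with the outer query string.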
Kick-off Operation Parameters [New in v2]
• _elements: FHIR resource elements to return, e.g., Patient.id, Patient.identifier (optional and experimental).
• patient: FHIR Patient references to limit the data returned (optional; not valid for GET requests or system-level requests).
• includeAssociatedData: Metadata resources to include with the response, e.g., LatestProvenanceResources or RelevantProvenanceResources (optional and experimental).
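Because the patient filter is not valid for GET requests, a v2 client would send these parameters in a POST body. The sketch below assumes that body is a FHIR Parameters resource and that the value types shown (`valueString`, `valueReference`, `valueCode`) are the right ones; both points should be checked against the Implementation Guide. The function name and the patient ids are hypothetical.

```python
def export_parameters(types=None, elements=None, patient_refs=None, associated=None):
    """Build a FHIR Parameters resource for a POST kick-off request."""
    parameter = []
    if types:
        parameter.append({"name": "_type", "valueString": ",".join(types)})
    if elements:
        parameter.append({"name": "_elements", "valueString": ",".join(elements)})
    # One repeated "patient" parameter per reference (assumed encoding)
    for ref in patient_refs or []:
        parameter.append({"name": "patient", "valueReference": {"reference": ref}})
    for code in associated or []:
        parameter.append({"name": "includeAssociatedData", "valueCode": code})
    return {"resourceType": "Parameters", "parameter": parameter}


body = export_parameters(
    types=["Patient", "Observation"],
    patient_refs=["Patient/123", "Patient/456"],  # hypothetical ids
    associated=["LatestProvenanceResources"],
)
```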
Kick-off Response
Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Kick-off Response Header
Status: 202 Accepted
Content-Location: [URL for status or deleting request]
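A client might handle this response along the following lines. This is a minimal sketch that takes the status code and headers as plain values, so it does not depend on any particular HTTP library; the function name is hypothetical.

```python
def status_url_from_kickoff(status_code, headers):
    """Return the status polling URL from a kick-off response."""
    if status_code != 202:
        raise RuntimeError(f"kick-off was not accepted: HTTP {status_code}")
    # HTTP header names are case-insensitive
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return lowered["content-location"]
    except KeyError:
        raise RuntimeError("202 response is missing Content-Location") from None


status_url = status_url_from_kickoff(
    202, {"Content-Location": "https://example.com/status/abc"}  # hypothetical URL
)
```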
Status Request #1
Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Status Response #1
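Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: File Generation Status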
In-Progress Status Response Header
Status: 202 Accepted
X-Progress: “50% complete”
Retry-After: 120
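The polling behaviour implied by these headers can be sketched as a loop that waits for the number of seconds in Retry-After before asking again. To keep the example free of network calls, the HTTP request is passed in as a hypothetical `fetch_status` callable returning `(status_code, headers, body)`; the default wait and poll limit are arbitrary assumptions.

```python
import time


def poll_until_complete(fetch_status, sleep=time.sleep, default_wait=10, max_polls=100):
    """Poll the status endpoint until the export is complete, then return the body."""
    for _ in range(max_polls):
        status, headers, body = fetch_status()
        if status == 200:
            return body  # export finished; body is the JSON manifest
        if status != 202:
            raise RuntimeError(f"export failed with HTTP {status}")
        retry_after = headers.get("Retry-After", "")
        # Retry-After may also be an HTTP date; this sketch only handles seconds
        wait = int(retry_after) if retry_after.isdigit() else default_wait
        sleep(wait)
    raise TimeoutError("export did not finish within max_polls status checks")
```

Injecting `sleep` makes the loop easy to exercise in tests without real delays.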
Status Request #2
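Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: File Generation Status
Bulk Data Client → Bulk Data Server: GET Content-Location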
Status Response #2
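Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: File Generation Status
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: JSON manifest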
Status Complete Response Body
{
  "transactionTime" : "2020-07-13T13:28:17.239Z",
  "request" : "https://example.com/Patient/$export?_type=Patient,Observation",
  "requiresAccessToken" : true,
  "output" : [{
    "type" : "Patient",
    "url" : "https://example.com/files/patient_file_1.ndjson"
  },{
    "type" : "Patient",
    "url" : "https://example.com/files/patient_file_2.ndjson"
  },{
    "type" : "Observation",
    "url" : "https://example.com/files/observation_file_1.ndjson"
  }],
  "deleted" : [{
    "type" : "Bundle",
    "url" : "https://example.com/output/del_file_1.ndjson"
  }],
  "error" : [{
    "type" : "OperationOutcome",
    "url" : "https://example.com/files/error_file_1.ndjson"
  }]
}
"deleted" is new in v2; "error" is updated in v2.
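Once the manifest arrives, a client typically needs the file URLs grouped by resource type before downloading. A minimal sketch, using a trimmed copy of the manifest above (the function name is hypothetical):

```python
import json
from collections import defaultdict

MANIFEST_JSON = """
{
  "transactionTime": "2020-07-13T13:28:17.239Z",
  "requiresAccessToken": true,
  "output": [
    {"type": "Patient", "url": "https://example.com/files/patient_file_1.ndjson"},
    {"type": "Patient", "url": "https://example.com/files/patient_file_2.ndjson"},
    {"type": "Observation", "url": "https://example.com/files/observation_file_1.ndjson"}
  ]
}
"""


def files_by_type(manifest):
    """Group the manifest's output file URLs by FHIR resource type."""
    grouped = defaultdict(list)
    for entry in manifest.get("output", []):
        grouped[entry["type"]].append(entry["url"])
    return dict(grouped)


manifest = json.loads(MANIFEST_JSON)
grouped = files_by_type(manifest)
# When requiresAccessToken is true, file requests need the same bearer token
needs_token = manifest.get("requiresAccessToken", False)
```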
File Request
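Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: File Generation Status
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: JSON manifest
Bulk Data Client → Bulk Data Server: GET File (e.g., 0001.Observation.ndjson)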
File Response
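Bulk Data Client (destination) → Bulk Data Server (source): Kick-off Request
Bulk Data Server → Bulk Data Client: Content-Location
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: File Generation Status
Bulk Data Client → Bulk Data Server: GET Content-Location
Bulk Data Server → Bulk Data Client: JSON manifest
Bulk Data Client → Bulk Data Server: GET File (e.g., 0001.Observation.ndjson)
Bulk Data Server → Bulk Data Client: FHIR Resources File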
FHIR Resources
Data models representing discrete clinical and administrative units (patient, practitioner,
allergy, medication order, etc.)
• Currently around 100 have been defined
• Can reference other resources by their URL
• Don’t include the kitchen sink, but support extensions
“We only include data elements if we are confident that most normal implementations
using that resource will make use of the element”
– Grahame Grieve (FHIR Product Director)
• MU3 Common Clinical Dataset (and soon USCDI) defines subset
NDJSON
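NDJSON (newline-delimited JSON) stores one complete JSON object per line, with no enclosing array and no commas between objects. A small sketch of reading such a file follows; the two Patient resources are made-up sample data.

```python
import json

SAMPLE_NDJSON = (
    '{"resourceType": "Patient", "id": "p1"}\n'
    '{"resourceType": "Patient", "id": "p2"}\n'
)


def parse_ndjson(text):
    """Parse newline-delimited JSON into a list of objects, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


patients = parse_ndjson(SAMPLE_NDJSON)
patient_ids = [p["id"] for p in patients]
```

Because each line parses on its own, large export files can be streamed line by line instead of being loaded whole.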
SMART Backend Services Authorization
• Out-of-band app registration (can use Dynamic Client Registration or
portal)
• Apps can register public key (JWKS format) or URL for public key
• Token requests signed with private key
• System level scope (parallels SMART “user” and “patient” scopes)
system/[resourceType].read
• Short-lived access tokens
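The token request described above can be sketched in two parts: the claims of the JWT that the client signs with its private key, and the form fields posted to the token endpoint. Signing is left out because it needs a JWT library that is not shown here; the client id, token URL, and scopes are hypothetical, and the five-minute lifetime is an assumption about a sensible upper bound.

```python
import time
import uuid

JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def client_assertion_claims(client_id, token_url, lifetime=300):
    """Claims for the JWT the client signs with its registered private key."""
    now = int(time.time())
    return {
        "iss": client_id,          # issuer: the client's own id
        "sub": client_id,          # subject: same as issuer
        "aud": token_url,          # audience: the server's token endpoint
        "exp": now + lifetime,     # short expiry limits replay
        "jti": str(uuid.uuid4()),  # unique id for this assertion
    }


def token_request_form(signed_jwt, scope):
    """Form fields for the client-credentials token request."""
    return {
        "grant_type": "client_credentials",
        "scope": scope,
        "client_assertion_type": JWT_BEARER,
        "client_assertion": signed_jwt,
    }


claims = client_assertion_claims("my-client-id", "https://example.com/auth/token")
form = token_request_form("<signed JWT goes here>", "system/Patient.read system/Observation.read")
```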
Registration Flow (once)
Backend Service Admin → Bulk Data Server: Configure Public Key and other OAuth settings
Bulk Data Server → Backend Service Admin: OAuth Client Id
Authorization Flow (min. once per request)
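Backend Service Admin → Bulk Data Server: Configure Public Key and other OAuth settings
Bulk Data Server → Backend Service Admin: OAuth Client Id
Bulk Data Client → Bulk Data Server: Signed Token Request
Bulk Data Server → Bulk Data Client: Short-Lived Access Token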
Tutorial (Python): EHR Export to SQL Exploration
https://colab.research.google.com/drive/1HhEEB3MJ8LbMP2ta946s8OARPc5RflHu
Enhancements in v2
Historical Group Data
• Server side with revised guidance on the “_since” parameter [Updated in v2]
“In the case of a Group level export, servers MAY return additional resources modified prior to the supplied time if the resources belong to the patient compartment of a patient added to the Group after the supplied time (this behavior should be clearly documented by the server).”
• Client side with the “patient” and “_elements” parameters [New in v2]
  • Make a request to get just the ids of patients in the group with “_elements”
  • Use the “patient” parameter to get data for patients not previously retrieved
  • Use the “patient” parameter and the “_since” parameter to get new data for remaining patients
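The client-side approach amounts to a set comparison between the group's current members and the patients already exported. The sketch below only plans the follow-up requests and makes no network calls; the function name, the patient ids, and the dictionary shape of each planned request are hypothetical.

```python
def plan_group_refresh(current_ids, previously_exported_ids, last_export_time):
    """Plan follow-up exports: full history for new members, deltas for the rest."""
    current = set(current_ids)
    known = set(previously_exported_ids)
    new_patients = sorted(current - known)  # added to the group since last export
    remaining = sorted(current & known)     # already exported at least once
    requests = []
    if new_patients:
        # No _since: fetch all data for patients not previously retrieved
        requests.append({"patient": [f"Patient/{pid}" for pid in new_patients]})
    if remaining:
        # With _since: fetch only data modified after the last export
        requests.append({
            "patient": [f"Patient/{pid}" for pid in remaining],
            "_since": last_export_time,
        })
    return requests


plan = plan_group_refresh(
    current_ids=["a", "b", "c"],       # ids from an _elements-limited export
    previously_exported_ids=["a", "b"],
    last_export_time="2020-07-13T13:28:17.239Z",
)
```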
Attachments [New in v2]
Resource elements of type Attachment must contain:
• a data element with a Base64-encoded version of the file
OR
• a url element with an absolute URL for the file, accessible using the same authentication as the ndjson files
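A client reading exported resources could branch on which of the two elements is present. This is a minimal sketch with a hypothetical function name and sample values; downloading the url case is left to the caller, who would reuse the access token used for the ndjson files.

```python
import base64


def resolve_attachment(attachment):
    """Return ("inline", bytes) or ("download", url) for a FHIR Attachment."""
    if "data" in attachment:
        return ("inline", base64.b64decode(attachment["data"]))
    if "url" in attachment:
        return ("download", attachment["url"])
    raise ValueError("Attachment has neither data nor url")


inline = resolve_attachment({"contentType": "text/plain", "data": "aGVsbG8="})
remote = resolve_attachment({"url": "https://example.com/files/scan_1.pdf"})
```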
Open Source Tools
SMART Reference Server Implementation
https://bulk-data.smarthealthit.org
SMART Sample Client
https://github.com/smart-on-fhir/sample-apps-stu3/tree/master/fhir-downloader
ONC Inferno Testing Tool
https://inferno.healthit.gov/community
FHIR Data Census Tool
https://github.com/sync-for-science/data-census
Adoption
Growing number of implementations!
Open Source FHIR Servers
• Microsoft
• HAPI
• IBM
EHRs
• Epic
• Cerner (prototype)
• T-System (prototype)
Commercial FHIR Servers
• Azure API for FHIR
• CareEvolution
• Firely Server (SQL endpoint)
• Google Healthcare API (in preview)
Payor Data Servers
• CMS ACO Beneficiary Claims Data (pilot)
• CMS Data at the Point of Care (pilot)
• CMS Claims to Part D Sponsors (pilot)
US Regulatory Requirements
170.215 [EHR Certification] Application Programming Interface
Standards.
The Secretary adopts the following application programming interface
(API) standards and associated implementation specifications […] FHIR
Bulk Data Access (Flat FHIR) (v1.0.0: STU 1), including mandatory
support for the “group-export” “OperationDefinition”
Implementation Date: 12/31/2022
Next Steps
Get Involved!
• Use the APIs in real-world use cases and collect ideas for v3
• Open source modules (e.g., de-identification, filtering, NLP)
• Define Bulk Import Operation (updates coming soon!)
Early draft proposals at https://github.com/smart-on-fhir/bulk-import/blob/master/import-pnp.md
• Standardize analytic approaches (identify cohorts, quality measures)
Resources
• Bulk Data Implementation Guide
STU2 (v2): https://hl7.org/fhir/uv/bulkdata/
STU1 (v1): http://hl7.org/fhir/uv/bulkdata/
• SMART Server Reference Implementation
https://bulk-data.smarthealthit.org
• Bulk Data Discussion Group (Bulk Data Stream on FHIR Zulip Chat)
https://chat.fhir.org/#narrow/stream/bulk.20data