@briangow
Collaborator

@briangow briangow commented Nov 6, 2025

This PR should be merged after the PRs ahead of it.

In #57, we updated visit_source_value to be simply hadm_id instead of a concatenation of hadm_id and other variables. This PR updates the JOIN statements that use visit_source_value so that they JOIN against hadm_id alone; see 225c79c.
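
As a rough illustration of the simplified JOIN, here is a minimal sketch in Python using an in-memory SQLite database rather than BigQuery. The table names (`visit`, `src_event`), the sample values, and the cast of hadm_id to text are assumptions made for the example, not the actual ETL tables or the exact statements changed in this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables; the real ETL tables have more columns.
    CREATE TABLE visit (visit_occurrence_id INTEGER, visit_source_value TEXT);
    CREATE TABLE src_event (trace_id TEXT, hadm_id INTEGER);
    INSERT INTO visit VALUES (1001, '20001'), (1002, '20002');
    INSERT INTO src_event VALUES ('t1', 20001), ('t2', 20002), ('t3', 20002);
""")

# visit_source_value now holds only hadm_id, so the join key is hadm_id alone
# instead of a concatenation of hadm_id and other variables.
rows = conn.execute("""
    SELECT src.trace_id, vis.visit_occurrence_id
    FROM src_event AS src
    LEFT JOIN visit AS vis
        ON vis.visit_source_value = CAST(src.hadm_id AS TEXT)
    ORDER BY src.trace_id
""").fetchall()

print(rows)
```

Every source event resolves to exactly one visit as long as hadm_id appears once per row on the visit side.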

src.source_concept_id AS drug_source_concept_id,
src.route_source_code AS route_source_value,
src.dose_unit_source_code AS dose_unit_source_value,
`@etl_project.@etl_dataset`.obf_id_str(src.trace_id, 32) AS drug_exposure_id,

I'm a little worried about the uniqueness of drug_exposure_id when the generated value is based on src.trace_id only: drug_exposure source rows can be multiplied by multiple mappings of drug_source_value (for example, when a drug is mapped to several ingredient concepts). Maybe src.trace_id + drug_concept_id would be a safer choice?
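
To make the concern concrete, here is a minimal sketch in Python with an in-memory SQLite database. The tables (`src`, `concept_map`), the concept IDs, and `stand_in_id` are all hypothetical; in particular, `stand_in_id` is only a deterministic hash standing in for obf_id_str, whose real implementation is not shown in this thread.

```python
import hashlib
import sqlite3
from collections import Counter


def stand_in_id(value: str) -> str:
    """Hypothetical stand-in for obf_id_str: a deterministic hash of the input."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (trace_id TEXT, drug_source_value TEXT);
    CREATE TABLE concept_map (source_code TEXT, target_concept_id INTEGER);
    INSERT INTO src VALUES ('t1', 'combo drug'), ('t2', 'single drug');
    -- 'combo drug' maps to two ingredient concepts (made-up IDs).
    INSERT INTO concept_map VALUES
        ('combo drug', 111), ('combo drug', 222), ('single drug', 333);
""")

# The one-to-many mapping turns two source rows into three output rows.
joined = conn.execute("""
    SELECT src.trace_id, cm.target_concept_id
    FROM src
    JOIN concept_map AS cm ON cm.source_code = src.drug_source_value
    ORDER BY src.trace_id, cm.target_concept_id
""").fetchall()

# An ID derived from trace_id alone repeats for every multiplied row.
ids = [stand_in_id(trace_id) for trace_id, _ in joined]
duplicated = [i for i, n in Counter(ids).items() if n > 1]
print(len(joined), len(set(ids)), len(duplicated))
```

In this toy data the join yields three rows but only two distinct IDs, which is the kind of collision a primary-key check on drug_exposure would flag.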

Collaborator

@atsvetkova-ody atsvetkova-ody left a comment

I looked through the changes, and I have only two notes:

  1. About drug_exposure_id: I see some risk of generating duplicate values when only src.trace_id is passed to the obfuscating function. Maybe src.trace_id + drug_concept_id would be a safer option.
  2. About visit_source_value, just an idea to think over later: we used to populate visit_source_value with the source visit_id to track the original visits. But now that there is a trace_id field, or if we add another non-OMOP field dedicated to the source visit_id, it would be safe to return visit_source_value to its initial role, i.e. to populate it with the values used to find visit_concept_id.

@briangow
Collaborator Author

Thanks for the feedback, @atsvetkova-ody!

About drug_exposure_id: I see some risk of generating duplicate values when only src.trace_id is passed to the obfuscating function. Maybe src.trace_id + drug_concept_id would be a safer option.

I've updated this to combine src.trace_id with drug_concept_id (src.target_concept_id) so that the generated drug_exposure_id is unique.
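
For illustration only, a composite key along these lines can be sketched in Python as follows. `stand_in_id`, the `|` separator, and the sample rows are assumptions for the example; the actual change uses obf_id_str in the ETL SQL and may combine the two values differently.

```python
import hashlib


def stand_in_id(*parts: object) -> str:
    """Hypothetical stand-in for the obfuscation function: hash of the joined key parts."""
    # A separator keeps ('1', '23') and ('12', '3') from collapsing to the same input.
    key = "|".join(str(p) for p in parts)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# (trace_id, target_concept_id) pairs with made-up values; 't1' maps to two concepts.
rows = [("t1", 111), ("t1", 222), ("t2", 333)]
ids = [stand_in_id(trace_id, concept_id) for trace_id, concept_id in rows]

print(len(rows), len(set(ids)))
```

One caveat with this kind of key: if the mapping ever returns the same concept twice for one trace_id, the combined key would still repeat, so a duplicate check on the output table remains worthwhile.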

About visit_source_value, just an idea to think over later: we used to populate visit_source_value with the source visit_id to track the original visits. But now that there is a trace_id field, or if we add another non-OMOP field dedicated to the source visit_id, it would be safe to return visit_source_value to its initial role, i.e. to populate it with the values used to find visit_concept_id.

I don't think I understand what you are suggesting here. Looking at the old code on main, visit_source_value is set from a number of different variables depending on the source table (see AS source_value in https://github.com/OHDSI/MIMIC/blob/main/etl/etl/lk_vis_part_2.sql). The reasoning behind this update was that, since hadm_id is unique per visit, it cleanly represents the visit_source_value.

Can I ask if you've reviewed all of the open PR commits (most of which are in this PR since I've been building on top of PRs to avoid merge conflicts)? I ask so I can decide what to merge.
