research-code

Computational methods for research informatics and genomics. Includes code examples from bennettwaxse.com and shared analysis tools.

Overview

This repository contains analysis pipelines and tools for working with NIH's All of Us Research Program data, including:

Genomics - Variant analysis, ancestry inference, PCA workflows (PLINK2, Hail)
HPV Research - OMOP-based cohort construction
N3C/RECOVER - Long COVID phenotyping algorithms
Reference Materials - All of Us data dictionaries, PheCode mappings, utilities

Platform

Code is designed for the All of Us Researcher Workbench:

Legacy Workbench (current) - Full genomics support
Verily Workbench (new) - See _reference/verily/ for setup
Requires Google Cloud Platform (BigQuery, Cloud Storage, Dataproc)
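
As a rough illustration of the BigQuery dependency, the sketch below builds a fully qualified table reference from an environment variable. It assumes the workspace exposes the CDR dataset as WORKSPACE_CDR, which may not hold on every Workbench version. The helper names cdr_table and person_count_sql are hypothetical and are not part of this repository.

```python
import os


def cdr_table(table: str, env_var: str = "WORKSPACE_CDR") -> str:
    """Return a backtick-quoted BigQuery table reference for the CDR.

    Assumption: the dataset ID ("project.dataset") is stored in the
    environment variable named by env_var.
    """
    dataset = os.environ.get(env_var)
    if not dataset:
        raise RuntimeError(f"{env_var} is not set; are you inside a workspace?")
    return f"`{dataset}.{table}`"


def person_count_sql() -> str:
    """Build a query that counts distinct participants in the OMOP person table."""
    return f"SELECT COUNT(DISTINCT person_id) AS n FROM {cdr_table('person')}"
```

The resulting string can be passed to whichever BigQuery client the workspace provides. Any count the query returns is still subject to the data policies described later in this README.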

Repository Structure

genomics/                # Genomic analysis pipelines (PLINK2, Hail, phetk)
hpv/                     # HPV cohort construction
nc3/                     # N3C RECOVER Long COVID algorithm
_reference/              # Reference data and utilities
  ├─ verily/             # Verily Workbench setup
  ├─ all_of_us_tables/   # CDR data dictionaries
  └─ phecode/            # PheCode mappings

Each directory contains both .py scripts and .ipynb notebooks (in notebooks/ subdirectories).

Getting Started

Review CLAUDE.md files - Each directory has guidance for working with that code
Set up environment - For Verily Workbench, run _reference/verily/00_setup_workspace.ipynb
Choose a template - Use existing scripts as starting points for your analysis

Important: All of Us Data Policies

Never share counts below 20 - Display them as "< 20" in all outputs
Never commit patient data - See .gitignore for protected file types
Follow data use agreements - All analyses must comply with All of Us policies
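
The small-count rule above can be applied at the point where results are formatted for display. The sketch below is a minimal illustration, not code from this repository: mask_small_count and mask_counts are hypothetical helpers, and masking zero along with 1 to 19 is an assumption based on a literal reading of the rule. Confirm the exact requirement against the current All of Us policy.

```python
from typing import Dict


def mask_small_count(count: int, threshold: int = 20) -> str:
    """Format a participant count, hiding any value below the threshold.

    Assumption: every count below the threshold, including zero, is masked.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return f"< {threshold}" if count < threshold else str(count)


def mask_counts(counts: Dict[str, int], threshold: int = 20) -> Dict[str, str]:
    """Apply mask_small_count to every value in a group -> count mapping."""
    return {group: mask_small_count(n, threshold) for group, n in counts.items()}
```

Masking individual cells may not be sufficient when a hidden value can be recovered from row or column totals, so derived totals should be reviewed as well.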

AI Assistance

This repository includes comprehensive CLAUDE.md files for use with Claude Code. These provide context about architecture, workflows, and platform-specific patterns.

Contributing

See CONTRIBUTING.md for guidelines on contributing to this repository.

License

See LICENSE for details.
