
Computational methods for research informatics and genomics research. Code examples from bennettwaxse.com and shared analysis tools.


bwaxse/research-code


research-code

Computational methods for research informatics and genomics research. Code examples from bennettwaxse.com and shared analysis tools.

Overview

This repository contains analysis pipelines and tools for working with NIH's All of Us Research Program data, including:

  • Genomics - Variant analysis, ancestry inference, PCA workflows (PLINK2, Hail)
  • HPV Research - OMOP-based cohort construction
  • N3C/RECOVER - Long COVID phenotyping algorithms
  • Reference Materials - All of Us data dictionaries, PheCode mappings, utilities
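As a rough illustration of what OMOP-based cohort construction can look like, the sketch below builds a BigQuery SQL string that selects people with at least one matching row in the standard OMOP `condition_occurrence` table. It is not taken from this repository: the function name, the dataset string, and the concept IDs are hypothetical placeholders, and the pipelines in hpv/ may be structured quite differently.

```python
from typing import Iterable


def build_condition_cohort_sql(cdr_dataset: str, concept_ids: Iterable[int]) -> str:
    """Build a query for the first qualifying condition date per person.

    cdr_dataset: fully qualified BigQuery dataset, e.g. "project.dataset"
                 (hypothetical value; supply your workspace's CDR dataset).
    concept_ids: OMOP condition concept IDs that define the cohort.
    """
    # Coerce to int so only numeric IDs are ever interpolated into the SQL.
    ids = [int(c) for c in concept_ids]
    if not ids:
        raise ValueError("at least one concept ID is required")
    id_list = ", ".join(str(c) for c in ids)
    return (
        "SELECT person_id, MIN(condition_start_date) AS first_condition_date\n"
        f"FROM `{cdr_dataset}.condition_occurrence`\n"
        f"WHERE condition_concept_id IN ({id_list})\n"
        "GROUP BY person_id"
    )


# Placeholder dataset and concept IDs, for illustration only.
sql = build_condition_cohort_sql("my-project.my_cdr", [111, 222])
print(sql)
```

Building the query as a string keeps the sketch independent of any BigQuery client; in a notebook the result would be passed to whichever query helper the workspace provides.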

Platform

Code is designed for the All of Us Researcher Workbench:

  • Legacy Workbench (current) - Full genomics support
  • Verily Workbench (new) - See _reference/verily/ for setup
  • Requires Google Cloud Platform (BigQuery, Cloud Storage, Dataproc)
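Because the code depends on Google Cloud resources, it can help to confirm the notebook environment is configured before running a pipeline. The sketch below assumes the environment variable names `GOOGLE_PROJECT`, `WORKSPACE_CDR`, and `WORKSPACE_BUCKET`. These names are not stated in this README and may differ between the Legacy and Verily workbenches, so adjust them to match your workspace.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Assumed variable names (not confirmed by this README); edit as needed.
REQUIRED_ENV_VARS = ("GOOGLE_PROJECT", "WORKSPACE_CDR", "WORKSPACE_BUCKET")


def missing_workbench_vars(
    env: Optional[Mapping[str, str]] = None,
    required: Sequence[str] = REQUIRED_ENV_VARS,
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_workbench_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Workbench environment looks configured.")
```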

Repository Structure

genomics/               # Genomic analysis pipelines (PLINK2, Hail, phetk)
hpv/                    # HPV cohort construction
nc3/                    # N3C RECOVER Long COVID algorithm
_reference/             # Reference data and utilities
  ├─ verily/            # Verily Workbench setup
  ├─ all_of_us_tables/  # CDR data dictionaries
  └─ phecode/           # PheCode mappings

Each directory contains both .py scripts and .ipynb notebooks (in notebooks/ subdirectories).

Getting Started

  1. Review CLAUDE.md files - Each directory includes one with guidance for working with the code in that directory
  2. Set up environment - For Verily Workbench, run _reference/verily/00_setup_workspace.ipynb
  3. Choose a template - Use existing scripts as starting points for your analysis

Important: All of Us Data Policies

  • Never share counts below 20 - Display them as "< 20" in all outputs
  • Never commit patient data - See .gitignore for protected file types
  • Follow data use agreements - All analyses must comply with All of Us policies
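The small-count rule above can be enforced with a helper applied to every count before it is displayed or exported. This is a minimal sketch, not code from this repository: the function names are hypothetical, and it conservatively masks every value below the threshold, including zero. Check the current All of Us policy for the exact boundary before relying on it.

```python
from typing import Dict, Union

MIN_REPORTABLE_COUNT = 20  # threshold stated in the policy list above


def mask_small_count(
    count: int, threshold: int = MIN_REPORTABLE_COUNT
) -> Union[int, str]:
    """Return the count if it is at or above the threshold, else a masked label."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count < threshold:
        return f"< {threshold}"
    return count


def mask_counts(counts: Dict[str, int]) -> Dict[str, Union[int, str]]:
    """Apply mask_small_count to every value in a mapping of label -> count."""
    return {label: mask_small_count(n) for label, n in counts.items()}


# Made-up numbers, for illustration only.
print(mask_counts({"group_a": 3, "group_b": 150}))
```

Masking at the point of output, rather than in the underlying data, keeps intermediate calculations exact while ensuring nothing below the threshold leaves the workbench.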

AI Assistance

This repository includes comprehensive CLAUDE.md files for use with Claude Code. These provide context about architecture, workflows, and platform-specific patterns.

Contributing

See CONTRIBUTING.md for guidelines on contributing to this repository.

License

See LICENSE for details.
