Applied AI Researcher · Data Scientist · Bioinformatics & Data Engineer
Bethesda, MD · [email protected]
I am a Ph.D. candidate in Bioinformatics & Computational Biology at George Mason University, with hands-on experience developing machine learning pipelines, building data infrastructure, and analyzing complex datasets across healthcare, clinical research, and behavioral science.
My professional background includes:
- Regulatory science research at the U.S. FDA, where I conducted statistical analyses and quantitative biomarker evaluations using clinical trial data (ORISE Fellow)
- Doctoral research in bioinformatics, focused on explainable AI and causal inference applied to high-dimensional gene expression datasets
- Data engineering and behavioral modeling work for a mobile health game platform with over 500K users (Howard Delafield International)
Programming & Analysis
Python · R · SQL · MATLAB · SAS · Bash · Linux
Machine Learning & AI
scikit-learn · PyTorch · TensorFlow · XGBoost · SHAP · CausalForestDML · DoWhy · EconML
Predictive Modeling · Deep Learning · Causal Inference · Model Explainability · Feature Engineering
Bioinformatics & Genomics
RNA-seq · scRNA-seq · Gene Expression Analysis · Pathway Enrichment
CDISC Standards · Biomarker Evaluation
Data Engineering & Infrastructure
ETL Pipelines · Apache Spark · AWS · MongoDB · MySQL · PostgreSQL · Docker
Data Warehousing · Data Modeling · System Backups
Visualization & Tools
Matplotlib · Seaborn · Plotly · Dash · ggplot2 · RShiny
Git · JupyterLab · VS Code · Streamlit · FastAPI
To demonstrate my applied skills and technical breadth, I am building open-source projects using public and synthetic datasets. These projects cover real-world workflows in AI, bioinformatics, ETL pipelines, and cloud engineering.
Predict disease status from gene expression data using machine learning and SHAP-based explainability.
Tech: Python · scikit-learn · SHAP · XGBoost · Docker · Jupyter · TensorFlow
AWS CDK (TypeScript) + Projen project that deploys a minimal, cost-safe foundation (public VPC, S3 artifacts, ECR, IAM) with CloudTrail, budget, tests, and GitHub Actions CI. Reproducible IaC starter for ML/AI workloads.
Tech: AWS · AWS-CDK · TypeScript · IaC · VPC · CloudFormation · ECR · Projen · CDK-NAG
This project demonstrates a compact, reproducible workflow to ingest public higher-education data, prepare student-weighted metrics, and publish stakeholder-ready Tableau dashboards.
Tech: Python · T-SQL · Tableau · Docker
Windows-first PySpark batch pipeline: ingest raw → bronze Parquet, run DQ checks, publish curated silver. PowerShell wrapper adds Spark hygiene, parallelism controls, and step logs.
Tech: PySpark · Spark · ETL · Python · Parquet · DQ · Bronze/Silver · Partitioning · PyArrow · PowerShell · Windows · Logging · Airflow · Git/GitHub
An NLP-powered web app that automatically analyzes customer reviews. It discovers key topics, generates AI summaries, and assesses sentiment using BERTopic and Hugging Face Transformers. The interactive dashboard is built with Streamlit.
Tech: python · nlp · data-science · machine-learning · streamlit · hugging-face · transformers · bertopic · spacy · data-visualization · sentiment-analysis
An NLP app that analyzes public consumer complaints using cloud-native AI. It uses Amazon Bedrock for generative summarization and sentiment analysis, and BERTopic for topic modeling. Insights are presented in an interactive Streamlit dashboard.
Tech: python · nlp · data-science · generative-ai · aws · aws-bedrock · boto3 · streamlit · bertopic · spacy · data-visualization · sentiment-analysis · prompt-engineering
- ALT ULN Variation in Hepatocellular DILI Evaluation – Drug Safety (Submitted, 2025)
- Alzheimer's Disease Prediction with Machine Learning and Explainability – PLOS ONE (Submitted, 2025)
- Data-driven Healthcare Indicators via Precision Gaming – Data & Policy, 2024
- AI in Healthcare Economics for Resource Management – PMC, 2022
- A Survey on Artificial Intelligence Assurance – Journal of Big Data, 2021
Presented at: AASLD 2023 · Global Digital Health Forum 2023 · GameChangers Fest 2024
- LinkedIn: linkedin.com/in/chhuang216
- Email: [email protected]