chhuang216/README.md

Chih-Hao (Andy) Huang

Applied AI Researcher · Data Scientist · Bioinformatics & Data Engineer
Bethesda, MD · [email protected]


About Me

I am a Ph.D. candidate in Bioinformatics & Computational Biology at George Mason University, with hands-on experience developing machine learning pipelines, building data infrastructure, and analyzing complex datasets across healthcare, clinical research, and behavioral science.

My professional background includes:

  • Regulatory science research at the U.S. FDA, where I conducted statistical analyses and quantitative biomarker evaluations using clinical trial data (ORISE Fellow)
  • Doctoral research in bioinformatics, focused on explainable AI and causal inference applied to high-dimensional gene expression datasets
  • Data engineering and behavioral modeling work for a mobile health game platform with over 500K users (Howard Delafield International)

Technical Skills

Programming & Analysis
Python · R · SQL · MATLAB · SAS · Bash · Linux

Machine Learning & AI
scikit-learn · PyTorch · TensorFlow · XGBoost · SHAP · CausalForestDML · DoWhy · EconML
Predictive Modeling · Deep Learning · Causal Inference · Model Explainability · Feature Engineering

Bioinformatics & Genomics
RNA-seq · scRNA-seq · Gene Expression Analysis · Pathway Enrichment
CDISC Standards · Biomarker Evaluation

Data Engineering & Infrastructure
ETL Pipelines · Apache Spark · AWS · MongoDB · MySQL · PostgreSQL · Docker
Data Warehousing · Data Modeling · System Backups

Visualization & Tools
Matplotlib · Seaborn · Plotly · Dash · ggplot2 · RShiny
Git · JupyterLab · VS Code · Streamlit · FastAPI


Featured Projects

To demonstrate my applied skills and technical breadth, I am building open-source projects using public and synthetic datasets. These projects cover real-world workflows in AI, bioinformatics, ETL pipelines, and cloud engineering.

gene-expression-ai-pipeline

Predict disease status from gene expression data using machine learning and SHAP-based explainability.

Tech: Python · scikit-learn · SHAP · XGBoost · Docker · Jupyter · TensorFlow

cdk-projen

AWS CDK (TypeScript) + Projen project that deploys a minimal, cost-safe foundation (public VPC, S3 artifacts, ECR, IAM) with CloudTrail, budget, tests, and GitHub Actions CI. Reproducible IaC starter for ML/AI workloads.

Tech: AWS · AWS-CDK · TypeScript · IaC · VPC · CloudFormation · ECR · Projen · CDK-NAG

student-outcomes-reporting

This project demonstrates a compact, reproducible workflow to ingest public higher-education data, prepare student-weighted metrics, and publish stakeholder-ready Tableau dashboards.

Tech: Python · T-SQL · Tableau · Docker
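
As a rough illustration of what a student-weighted metric means here, the sketch below computes an enrollment-weighted mean of a per-institution rate, so that larger institutions count proportionally more. The field names (`enrollment`, `completion_rate`), the function name, and the sample numbers are hypothetical and are not taken from the repository.

```python
from typing import Iterable, Mapping, Optional


def student_weighted_mean(
    rows: Iterable[Mapping[str, Optional[float]]],
    value_key: str,
    weight_key: str = "enrollment",
) -> float:
    """Weighted mean of `value_key`, weighting each row by its student count.

    Rows with a missing value or a missing/non-positive weight are skipped.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for row in rows:
        value = row.get(value_key)
        weight = row.get(weight_key)
        if value is None or weight is None or weight <= 0:
            continue  # skip rows that cannot contribute to the metric
        weighted_sum += value * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no rows with a usable value and weight")
    return weighted_sum / total_weight


# Hypothetical sample data: institution C is skipped because its rate is missing.
institutions = [
    {"enrollment": 1000, "completion_rate": 0.50},
    {"enrollment": 3000, "completion_rate": 0.70},
    {"enrollment": 500, "completion_rate": None},
]

weighted = student_weighted_mean(institutions, "completion_rate")
unweighted = (0.50 + 0.70) / 2
print(f"student-weighted: {weighted:.3f}, simple average: {unweighted:.3f}")
```

The weighted figure sits closer to the larger institution's rate than the simple average does, which is the point of weighting by students rather than by institution.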

realtime-data-pipeline

Windows-first PySpark batch pipeline: ingests raw data into bronze Parquet, runs data-quality (DQ) checks, and publishes a curated silver layer. A PowerShell wrapper adds Spark hygiene, parallelism controls, and step logs.

Tech: PySpark · Spark · ETL · Python · Parquet · DQ · Bronze/Silver · Partitioning · PyArrow · PowerShell · Windows · Logging · Airflow · Git/GitHub
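
The bronze → DQ checks → silver flow can be sketched in a few lines. This is a standard-library-only illustration of the gating idea, not the project's PySpark code. The rule names, the `id` and `amount` columns, and the sample rows are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass
class DQReport:
    """Counts of rows seen, rows promoted, and failures per rule."""
    total: int = 0
    passed: int = 0
    failures: Dict[str, int] = field(default_factory=dict)


def check_row(row: Row) -> List[str]:
    """Return the names of the (hypothetical) DQ rules this row violates."""
    failed = []
    if not row.get("id"):  # None or empty string
        failed.append("id_not_null")
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        failed.append("amount_non_negative")
    return failed


def promote_to_silver(bronze_rows: List[Row]) -> Tuple[List[Row], DQReport]:
    """Keep only rows that pass every rule; tally failures for logging."""
    report = DQReport()
    silver: List[Row] = []
    for row in bronze_rows:
        report.total += 1
        failed = check_row(row)
        if failed:
            for rule in failed:
                report.failures[rule] = report.failures.get(rule, 0) + 1
        else:
            silver.append(row)
            report.passed += 1
    return silver, report


# Hypothetical bronze rows: only the first one satisfies both rules.
bronze = [
    {"id": "a1", "amount": 10.0},
    {"id": None, "amount": 5},
    {"id": "a3", "amount": -2},
    {"id": "", "amount": None},
]

silver, report = promote_to_silver(bronze)
print(f"{report.passed}/{report.total} rows promoted; failures: {report.failures}")
```

In a Spark version the same rules would typically be expressed as column filters, with the failure counts written to the step logs.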

customer-feedback-analyzer

An NLP-powered web app to automatically analyze customer reviews. This tool discovers key topics, generates AI summaries, and assesses sentiment using BERTopic and Hugging Face Transformers. The interactive dashboard is built with Streamlit.

Tech: Python · NLP · Data Science · Machine Learning · Streamlit · Hugging Face · Transformers · BERTopic · spaCy · Data Visualization · Sentiment Analysis

bedrock-feedback-analyzer

An NLP app analyzing public consumer complaints using cloud-native AI. It leverages AWS Bedrock for generative summarization and sentiment analysis, and BERTopic for topic modeling. Insights are presented in an interactive Streamlit dashboard.

Tech: Python · NLP · Data Science · Generative AI · AWS · AWS Bedrock · boto3 · Streamlit · BERTopic · spaCy · Data Visualization · Sentiment Analysis · Prompt Engineering


Publications & Presentations

  • ALT ULN Variation in Hepatocellular DILI Evaluation – Drug Safety (Submitted, 2025)
  • Alzheimer's Disease Prediction with Machine Learning and Explainability – PLOS ONE (Submitted, 2025)
  • Data-driven Healthcare Indicators via Precision Gaming – Data & Policy, 2024
  • AI in Healthcare Economics for Resource Management – PMC, 2022
  • A Survey on Artificial Intelligence Assurance – Journal of Big Data, 2021

Presented at: AASLD 2023 · Global Digital Health Forum 2023 · GameChangers Fest 2024


Contact
