DataMonki/README.md

👋🏾 Hey, I’m Chukwuka-Steve Orefo (Chuks Orefo)

Multidisciplinary data scientist with formal training in Biochemistry, Pharmacology, Neuroscience, **Applied Data Science & Machine Learning**, and Computer Science with Software Engineering.


🗣 Programming languages

🛠️ Tech toolkit

| Domain | Stack & tooling (inline) |
| --- | --- |
| Backend / APIs | Django, FastAPI, Flask |
| Data engineering | Pandas, PyTorch, Airflow, tidyverse, SQL |
| ML & MLOps | scikit-learn, TensorFlow, PyTorch |
| DevOps & Cloud | Docker, Kubernetes, Helm, GitHub Actions, Azure, GCP, AWS |
| Frontend | React, Next.js, Flutter, Bootstrap |

📈 Current focus

  • 🔍 Data visualisation – extending a real-time SQL + Python dashboard for GP data quality with automated split/merge detection.
  • ⚙️ Data pipeline performance optimisation via Go micro-services – refactoring a high-throughput extract pipeline, meeting SLA targets and handing over to distributed teams.
  • 🧪 Synthetic-data engineering – hardening a Kubernetes-hosted Django-Celery-Redis stack on Azure; CLI tooling and pen-test readiness.
  • 🧠 NLP prototyping – re-training and productionising a dosage-instruction classifier, pushing precision/recall beyond 0.85.
  • 🧬 Fault-injection with dummy test data – building controllable datasets to stress-test pipelines inside CI/CD.
  • 🔗 QA automation – co-developing an automated QA library and CI hooks for linked datasets.

🚀 Featured project

Repo Summary Core tech

📬 Connect

LinkedIn


GitHub Stats

Pinned

  1. AdapTable Public

    Forked from drumpt/AdapTable

    Official implementation of AdapTable: Test-Time Adaptation for Tabular Data via Shift-Aware Uncertainty Calibrator and Label Distribution Handler (NeurIPSW-TRL 2024)

    Jupyter Notebook

  2. CTGAN Public

    Forked from sdv-dev/CTGAN

    Conditional GAN for generating synthetic tabular data.

    Python

  3. regularized-bon Public

    Forked from CyberAgentAILab/regularized-bon

    Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).

    Python

  4. tab-ddpm Public

    Forked from yandex-research/tab-ddpm

    [ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"

    Python

  5. verl Public

    Forked from volcengine/verl

    verl: Volcano Engine Reinforcement Learning for LLMs

    Python