jjryan0/README.md

Hi there!

About Me

I’m a Senior Data Analytics Professional with 9+ years of experience delivering data-driven insights in customer-focused analytics. I’m passionate about using data to improve lives, and I apply my expertise in KPI reporting, requirements translation, ETL, data modelling, ML, and data engineering to innovate in business and customer services.

What I Bring:

  • AI/ML & NLP Expertise: Building end-to-end ML pipelines with scikit-learn, IBM Watson AI, and NLP for text/speech analytics, including OCR/EDI processing for operational efficiency. See NLP Projects and ML Projects.
  • Customer & Finance Analytics for Impact: Transforming unstructured data into actionable insights for personalized customer experiences.
  • Scalable Data Solutions: Designing ETL/ELT pipelines with Azure Data Factory, DBT, and cloud platforms (Azure, IBM Cloud) to support real-time analytics.
  • Collaborative Leadership: Bridging technical and business teams to deliver measurable outcomes, recognized with IBM Watson Health’s People’s Choice Culture Award for cross-team knowledge sharing.
  • Analytic Experience: Committed to data-driven innovation in finance, education, Workday, and healthcare, and inspired to enhance customer outcomes.

💼 Featured Projects

Machine Learning Projects

| Project | Description |
| --- | --- |
| 📉 XGBoost Real-time Credit Default Prediction App with FastLoop | This project simulates a real-world credit scoring pipeline, built with dbt, Python, Postgres, and scikit-learn to demonstrate data engineering and MLOps capabilities. It productionizes a machine learning workflow, using FastAPI for real-time scoring and MLflow for model versioning. |
| 🎯 Credit Default Prediction Logistic Regression | This model simulates a real-world credit scoring analysis, using scikit-learn Logistic Regression to predict next month's credit card default payments. |
| 🤖 IBM Watson AI Visual Recognition App | Trained IBM Watson's image recognition service on IBM Cloud to classify road junctions for digital motor claims. Built using Python, IBM Cloud, IBM Watson Classifier, and YAML. |
| 🧠 ML - Ensemble methods (GBM, Decision Tree) | Built a machine learning system to evaluate ensemble methods, examining the performance gained by tuning the learning-rate hyperparameter, using scikit-learn and pandas. |
| 📊 Python Data Science Projects | Collection of experimental regression, classification, and model evaluation notebooks (GBM, RF, Logistic Regression) using sklearn, numpy, seaborn, MLlib, and pandas. |
| 🌊 Wave Flux - Feature Engineer Pipeline Steps | Extracts raw sensor data and calculates wave energy flux as a target column from oceanographic wave data, collected via MQTT from wave buoys off the coast of Ireland, to identify new predictive measures for onshore ML platform performance. |
| 📈 ML Model for Health Diagnosis Prediction | This analysis uses machine learning as a method to predict cancer diagnoses. |

Cyber Security & Risk Management Projects

| Project | Description |
| --- | --- |
| 🚨 Isolation Forest Anomaly Detection to flag suspicious login activity - MITRE ATT&CK Mapping | Built a Python-based anomaly detection model using Isolation Forest to flag suspicious login behavior such as unusual login hours, new geolocations, and failed attempts. |
| 📈 Bayesian Gaussian Mixture Model (GMM) - Detect Unusual Login Activity | Built a threat detection model using a Bayesian Gaussian Mixture Model (GMM) to flag suspicious login behavior such as unusual login hours, new geolocations, and failed attempts. |
| 🎯 Reducing False Positives in AML Screening | Considers a sanctions screening system that generates an excessive number of false positives, resulting in operational delays and unnecessary customer friction. Combining feature engineering, rule tuning, and statistical evaluation would likely reduce the false positive rate while maintaining true positive performance. |
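
To illustrate the Isolation Forest approach to login anomalies described above, here is a minimal sketch using scikit-learn. It is not the code from the linked project: the login data is synthetic, the two features (login hour and failed attempts) are a simplification that leaves out geolocation, and the `contamination` value is an assumption rather than a tuned setting.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic login features, invented for illustration: [login hour, failed attempts].
rng = np.random.default_rng(0)
normal_logins = np.column_stack([
    rng.normal(loc=10.0, scale=1.5, size=200),  # logins clustered around mid-morning
    rng.poisson(lam=0.3, size=200),             # failed attempts are rare
])
suspicious_login = np.array([[3.0, 12.0]])      # 3 a.m. login after 12 failed attempts
logins = np.vstack([normal_logins, suspicious_login])

# contamination is an assumed share of anomalous logins, not a tuned value.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
labels = model.fit_predict(logins)              # -1 = anomaly, 1 = normal

flagged = logins[labels == -1]
print(f"{len(flagged)} of {len(logins)} logins flagged as anomalous")
```

Isolation Forest needs no labelled attacks, which suits login data where confirmed incidents are scarce. Flagged rows would still need review before being treated as threats.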

NLP Projects

| Project | Description |
| --- | --- |
| 👾 NLP-Based Structured Data Extraction and Anomaly Detection for Healthcare | Applies NLP to OCR and EDI output to extract structured data from healthcare documents such as medical bills, discharge summaries, and insurance claims, then extracts and analyzes Major Diagnostic Category (MDC) codes, using scikit-learn. |
| 🔧 Tesseract OCR - Optical Character Recognition (OCR) | Extracts patient information, claim numbers, diagnoses (ICD codes), procedures (CPT/HCPCS), and billing details from HCFA/CMS-1500 forms (physician claims). |
| 📉 Sentiment Analysis on Healthcare Company Reviews: NLP Project | Built a sentiment analysis classifier for healthcare feedback (e.g., patient reviews) using scikit-learn and NLTK, predicting positive/negative sentiment. Demonstrates NLP preprocessing, ML modeling, and MLOps for production-ready analytics. |
| ⚡️ PySpark MLlib TF-IDF Extraction | Built a feature extraction model using TF-IDF for natural language processing tasks with PySpark MLlib. |
| 💡 ML model for text/speech extraction | Trained a text classifier to predict the post topic, i.e. whether the post is related to science/medicine. |
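
As a rough illustration of TF-IDF feature extraction, the sketch below uses scikit-learn's `TfidfVectorizer` in place of the PySpark MLlib used in the project, so that it runs without a Spark session. The three review strings are made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up healthcare review snippets, for illustration only.
reviews = [
    "the staff were kind",
    "long wait and rude staff",
    "kind and quick service",
]

# Learn the vocabulary and IDF weights, then build the document-term matrix.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(reviews)

# Compare two terms within the second review.
vocab = vectorizer.vocabulary_
second_review = tfidf[1].toarray().ravel()
rude_weight = second_review[vocab["rude"]]    # appears in one review only
staff_weight = second_review[vocab["staff"]]  # appears in two reviews

print(tfidf.shape)
print(rude_weight > staff_weight)
```

Terms that occur in fewer documents receive a higher inverse document frequency, so "rude" outweighs "staff" in the second review. The resulting sparse matrix can be passed directly to a classifier.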

ETL Pipelines

| Project | Description |
| --- | --- |
| 🛠 Azure Data Factory ETL | Cleaned and transformed raw e-commerce order data using ADF pipelines, staging layers, and SQL scripts. |
| 🛠 dbt Project for NYC Taxi Data | Sample dbt project for transforming NYC Taxi trip data using PostgreSQL. |
| 💡 SQL Server MERGE with Hash Keys – SCD Type 2 | Demonstrates using SQL Server's MERGE with hash keys for Slowly Changing Dimensions (SCD Type 2), tracking changes in employee records. |
| 💡 DBT incremental model for SCD type 2 Customer Dimension | Incremental model for a historical load of a Type 2 Slowly Changing Dimension (SCD) in dbt using SQL Server: compares the full source to the target, expires old records, and inserts new records. |
| ⚡️ Pandas - Slowly Changing Dimension (SCD) Type 1 | Implements Slowly Changing Dimension (SCD) Type 1 in pandas. Existing records are updated with new information when applicable; if no changes are present, the existing data is kept. |
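
The SCD Type 1 behaviour described above (overwrite matching records, keep everything else, no history) can be sketched in pandas as follows. This is one possible approach under stated assumptions, not the implementation from the linked repository: `apply_scd_type1`, the `customer_id` key, and the sample rows are all hypothetical.

```python
import pandas as pd

def apply_scd_type1(target, source, key):
    """Overwrite target rows whose key appears in source and append new keys.

    No history is kept, which is what distinguishes Type 1 from Type 2.
    """
    target_idx = target.set_index(key)
    source_idx = source.set_index(key)
    # Target rows with no matching key in the source are kept unchanged.
    unchanged = target_idx.loc[~target_idx.index.isin(source_idx.index)]
    # Source rows replace matching target rows and add brand-new keys.
    merged = pd.concat([unchanged, source_idx]).sort_index()
    return merged.reset_index()

# Hypothetical dimension table and incoming batch.
target = pd.DataFrame({"customer_id": [1, 2], "city": ["Cork", "Dublin"]})
source = pd.DataFrame({"customer_id": [2, 3], "city": ["Galway", "Limerick"]})

result = apply_scd_type1(target, source, key="customer_id")
print(result)
```

Customer 1 is untouched, customer 2 is overwritten with the new city, and customer 3 is inserted. The sketch assumes the source carries every column of the target; a partial update would need a column-wise merge instead.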

RESTful APIs

| Project | Description |
| --- | --- |
| ✅ Test Script Scenarios for RESTful API | This Postman and JavaScript test suite verifies the reliability and performance of an image website's RESTful API by checking response time, data accuracy, and JSON structure. |
| ☑️ FastAPI credit scoring application | Exposes an ML model as a REST API with FastAPI. This approach works best when the model needs to score one record at a time for real-time predictions. |

Data & Business Analysis

| Project | Description |
| --- | --- |
| 📈 Calculating Revenue Growth Percentage using SQL, Pandas, Python3 Functions | Growth percentage is a key financial metric that shows the rate at which a value (such as revenue, profit, or customer count) has increased or decreased compared to the previous month. |
| 🏠 Smart Kitchen - Requirements & Product Design | This documentation outlines the methods used to elicit requirements, lists and prioritizes the requirements gathered from all stakeholders, and documents the data flow diagrams for a new product, codenamed ‘Kitchen 2020’, a single kitchen solution. |
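
The month-over-month growth percentage mentioned above is (current - previous) / previous × 100. A minimal plain-Python sketch follows; the function name and the revenue figures are hypothetical, and returning `None` when the previous value is zero is a design choice for the example.

```python
def month_over_month_growth(values):
    """Return the growth percentage of each month relative to the previous one.

    The first month has no predecessor, so its entry is None. A previous
    value of 0 also yields None, because the growth rate is undefined.
    """
    if not values:
        return []
    growth = [None]
    for previous, current in zip(values, values[1:]):
        if previous == 0:
            growth.append(None)
        else:
            growth.append(round((current - previous) / previous * 100, 2))
    return growth

# Hypothetical monthly revenue figures, for illustration only.
monthly_revenue = [1000.0, 1100.0, 990.0]
print(month_over_month_growth(monthly_revenue))  # [None, 10.0, -10.0]
```

The same calculation is available in pandas as `Series.pct_change()` and in SQL via the `LAG` window function.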

🎓 Education

  • 🎓 MSc, Global Financial Information Systems – SETU
  • 🎓 Higher Diploma, Data Analytics – National College of Ireland
  • 🎓 BSc, Business – South Eastern Technological University

📜 Certifications


📫 Let’s Connect

If you're looking for a results-driven data engineer or analyst who can own the full data journey, from ingestion to insight, feel free to explore my repos or connect with me on LinkedIn.

Pinned

  1. gradient-boosting-machines-vs-decision-tree-demo-99.30-acc-sklearn (Public)

    Jupyter Notebook

  2. code-R-notebooks-Rshiny-apps (Public)

    Jupyter Notebook

  3. Machine Learning - Accelarating Claims Process.ipynb

  4. ibm-watson-visual-recognition-system-identifying-junction-types (Public)

    Watson Image recognition system built to identify road junction types based on road characteristics and markings from a large number of traffic incident images to help the assessment of automated d…

    Python

  5. NYC-Taxi-ETL-with-dbt-Core (Public)

    This is a sample dbt project for transforming NYC Taxi trip data using PostgreSQL.

  6. T_SQL_Scripts (Public)

    TSQL