An exploratory analysis of the Kaggle bikeshare data set that applies linear regression models, which are not optimal for predicting the number of bikes rented.

R 1 Updated Dec 11, 2018

LeondraJames / HeartDisease

"What Your Heart Is Telling You" Logit Model

Jupyter Notebook 1 Updated Dec 12, 2018

LeondraJames / Titanic_Attempt-1

Kaggle Titanic Data Set Using Logit Model

R 1 Updated Dec 13, 2018

LeondraJames / First-KNN-Attempt---ISLR-Caravan-Dataset

This is my first attempt at a KNN model, in which I classify the purchase of caravan insurance in the Caravan data set (ISLR package).

R 1 Updated Jan 15, 2019

LeondraJames / LoanPaymentPrediction_SVM

My first attempt at building an SVM model with a Gaussian kernel, optimizing the cost and gamma parameters via grid search.

R 1 Updated Jan 21, 2019

LeondraJames / BostonHousingPrices_NeuralNet

My first attempt at implementing a neural network using the Boston housing data set from the MASS library.

R 1 Updated Jan 22, 2019

LeondraJames / ChipotleLocations

This is a descriptive and exploratory data analysis project from DataCamp which aims to explore real data on every Chipotle location to identify franchising opportunities. The goal is to scout out …

1 Updated Feb 1, 2019

LeondraJames / HarvardXCapstone---Film-Recommender-System

Capstone Submission #1 for the Harvard University Professional Certificate in Data Science.

R 1 2 Updated Feb 14, 2019

LeondraJames / Degrees-That-Pay-You-Back

A cluster analysis leveraging the k-means algorithm to determine which degrees are likely to yield which levels of income based on historical data.

Jupyter Notebook 1 Updated Feb 26, 2019

LeondraJames / AdClick_Fraud

Capstone project #2 for the Harvard University Professional Certificate in Data Science

R 3 Updated Feb 26, 2019

LeondraJames / TV-HALFTIME-SHOWS-AND-THE-BIG-GAME

EDA project using SQL in Jupyter Notebooks, focusing on the history of games, broadcasts and performances for the National Football League

1 1 Updated Apr 30, 2019

LeondraJames / MarketBasketAnalysis-MBA-

Association rule mining using the Apriori algorithm

R 1 Updated May 1, 2019

LeondraJames / MobileGameABTest

Two A/B tests comparing average player 1) 1-day and 2) 7-day retention between the control (old player level) and the new version (new player level)

1 Updated Jun 14, 2019

LeondraJames / AWSSageMaker_PythonXGBoostTutorial

Python XGBoost model, using Amazon SageMaker, EC2 instances and S3 buckets. Used to prepare, partition, train, tune, predict and evaluate model. Project involves predicting customers who sign up fo…

Jupyter Notebook 1 Updated Sep 3, 2019

LeondraJames / MarkovChains_MultiTouchAttribution

Multi-touch attribution models, including Markov chains

R 1 Updated Sep 9, 2019

LeondraJames / Disney-Movies-Box-Office-Hits

Analysis of Disney's top grossing films (adjusted for inflation) in Python, using regression to attribute film genre to success. The project includes using regression on the data, as well as bootst…

Jupyter Notebook 2 Updated Sep 9, 2019

LeondraJames / Film-Similarity-NLP-with-KMeans-Hierarchical-Clustering

Used NLP techniques (tokenization, stemming, vectorization for TF-IDF) and clustering algorithms (Kmeans and Hierarchical clustering) to mine the "similarities" between films based on their plots p…

Python 1 2 Updated Aug 10, 2022

LeondraJames / Whale-Image-Classification-

Computer Vision project

Jupyter Notebook 1 2 Updated Sep 27, 2019

LeondraJames / WalmartStockEDA

An EDA of Walmart stock data using Databricks, Spark and PySpark.

Jupyter Notebook 1 Updated Oct 4, 2019

LeondraJames / Hyundai-Cruise-Ship-Crew-Prediction

Predicting the number of crew members needed to man a Hyundai cruise ship from information such as the number of cabins and passengers, using linear regression. Leveraged SQL and PySpark.

Jupyter Notebook 2 8 Updated Oct 4, 2019

LeondraJames / Private_Public_Colleges

Predicting whether a university is private or public with tree-based models (i.e., decision tree, random forest, and gradient-boosted tree classifiers) in PySpark and Databricks.

Jupyter Notebook 1 Updated Nov 25, 2019