Deepak Bhandari
Data Analyst
[email protected] Mobile: +917393888886
PROFESSIONAL SUMMARY:
3+ years of experience in the IT field, including 2 years as a Data Analyst, with strong technical expertise,
business experience, and communication skills to drive high-impact business outcomes through data-driven
innovations and decisions.
Extensive experience in Text Analytics, developing Statistical Machine Learning and Data Mining solutions to
various business problems, and generating data visualizations using Python.
Proficient in data preparation, including Data Extraction, Data Cleansing, Data Validation and
Exploratory Data Analysis, to ensure data quality.
Expertise in transforming business requirements into analytical models, designing algorithms, building models,
and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured
data.
Experienced in data cleaning and data imputation (outlier detection, missing value treatment).
Proficient in Statistical Modeling and Machine Learning techniques (Linear Regression, Logistic Regression,
Decision Trees, Random Forest, SVM, K-Nearest Neighbors, Bayesian methods) for Forecasting/Predictive Analytics,
Segmentation methodologies, Regression-based models, Hypothesis testing, Factor analysis/PCA and Ensembles.
Extracted and worked with data from database sources such as SQL Server.
Hands-on working experience in machine learning and statistics to draw meaningful insights from data. Strong at
communication and storytelling with data.
Hands-on experience with MLlib utilities such as classification, regression and clustering.
Strong knowledge of statistical methods (regression, time series, hypothesis testing, randomized experiments),
machine learning, algorithms, data structures and data infrastructure.
Solid team player, team builder, and an excellent communicator.
Extensive hands-on experience and high proficiency with structured, semi-structured and unstructured data, using a
broad range of data science programming languages and big data tools including Python, Spark, SQL, scikit-learn
and Hadoop MapReduce.
Expertise in Python programming with various packages including NumPy, Pandas, SciPy and scikit-learn.
Experience working on Windows.
Developed MapReduce programs to perform Data Transformation and analysis.
EDUCATION:
Bachelor of Computer Science, HNB Garhwal University, Srinagar (UK)
TOOLS AND TECHNOLOGIES:
Languages and Libraries: Python (Pandas, NumPy, SciPy, scikit-learn, Matplotlib)
Databases: SQL, SQL Server
Big Data Technologies: Hive, MapReduce
Tools and Utilities: SQL Server Management Studio, SQL Server Enterprise Manager, Microsoft Office,
Excel Power Pivot, Excel Data Explorer
Machine Learning: Linear Regression, Logistic Regression, Gradient Boosting, Random Forests,
Maximum Likelihood Estimation, Clustering, Classification & Association Rules, K-Nearest Neighbors (KNN),
K-Means Clustering, Decision Tree (CART & CHAID), Factor Analysis, Sampling Design, Time Series Analysis,
Text Mining
PROFESSIONAL EXPERIENCE:
Crystal Analytix – Gurgaon July 2017 – Till date
Role: Data Analyst
Responsibilities:
Performed exploratory data analysis using descriptive statistics and data visualization to
determine the baseline MLAs.
Analyzed large data sets, applied machine learning techniques, and developed and enhanced
predictive and statistical models by leveraging best-in-class modeling techniques.
Coordinated the execution of A/B tests to measure the effectiveness of a personalized
recommendation system.
Developed Python modules for machine learning & predictive analytics.
Worked on customer segmentation using an unsupervised learning technique - clustering.
Utilized SQL to query and manipulate data from a variety of data sources while maintaining data integrity.
Worked on data cleaning, data preparation and feature engineering with Python, including
NumPy, SciPy, Pandas, Matplotlib, Seaborn and scikit-learn.
Used Python to implement different machine learning algorithms, including Generalized Linear
Models, SVM, Random Forest and Boosting.
Performed data cleaning: filled in missing values, handled noisy data, identified or removed
outliers, and resolved inconsistencies.
Environment: Python, NumPy, Pandas, SQL, MLlib, regression, cluster analysis, linear regression, logistic
regression, Hive, random forest, MapReduce, machine learning algorithms.
HCL Technologies – Noida Dec 2015 – Jan 2017
Role: Java Developer
Responsibilities:
Strong understanding of Object Oriented Programming (OOP).
Engineered web applications across all layers, from database to services to user interfaces.
Analysis and design of system and user interfaces.
Managing requirements.
Implementing software development life cycle policies and procedures.
Highly adaptable in quickly changing technical environments with very strong organizational and
analytical skills.
Maintain high standards of software quality within the team by establishing good practices and
habits.
Adhere to high-quality development principles while delivering solutions on time.
Wrote code and worked with team members on technical matters.
Created new classes, utilities and components to implement business logic in end-to-end
development.
Delivered applications on time and free of known bugs after testing, and provided QA support.
Environment: Core Java, JSP/Servlet, JDBC, Spring 4.0, MySQL, HTML, Web Server: Apache Tomcat 8.0.
PROJECT DETAILS:
PROJECT-01
Title: Horse Race Handle Prediction
Platform: Python, Machine Learning Techniques
Description: Responsible for determining how to increase the handle of a horse race (handle: total
money bet on the race).
Responsible for identifying the most prominent factors that influence handle.
Explored the datasets by referring to the data dictionary provided.
Involved in multiple iterations to create a model with Python (approx. 84%).
Used a model fit graph to validate the model. Identified the positive and negative factors that
influence handle.
PROJECT-02
Title: Credit Scoring Model
Platform: Python, Machine Learning Techniques
Description: Responsible for calculating the probability of default for the customers of a bank. Used
data on the bank's customers and their credit ratings (Good/Bad) based on previous history.
Responsible for building a scoring model that predicts potential credit ratings using logistic
regression.
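Illustrative sketch: a minimal example of the kind of probability-of-default scoring that logistic regression provides. This is not the project's actual code. The data is synthetic, and the two features (credit utilization and a scaled count of late payments), the function names, and the training settings are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            pred = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = pred - yi  # gradient of log-loss w.r.t. the linear score
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def prob_default(w, x):
    """Predicted probability that a customer with features x defaults."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Synthetic customers (hypothetical features, both scaled to 0..1):
# x[0] = credit utilization, x[1] = late payments / 5. Label 1 = "Bad" rating.
random.seed(0)
X, y = [], []
for _ in range(200):
    util = random.random()
    late = random.randint(0, 5) / 5.0
    true_p = sigmoid(4.0 * util + 3.0 * late - 3.5)
    X.append([util, late])
    y.append(1 if random.random() < true_p else 0)

w = fit_logistic(X, y)
risky = prob_default(w, [0.9, 1.0])  # high utilization, many late payments
safe = prob_default(w, [0.1, 0.0])   # low utilization, no late payments
print(f"P(default) risky={risky:.2f}, safe={safe:.2f}")
```

The predicted probability can then be thresholded, or mapped to score bands, to assign a Good/Bad rating. A production model would typically rely on a library such as scikit-learn and be validated on held-out data.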