This is a collection of notes with introductions, examples, Jupyter notebooks, and code for diving into the world of data science. It assumes some prior knowledge of programming, Python, and mathematics.
The contents include:
- Part 1: data analysis, feature engineering, supervised models
- Part 2: unsupervised models, model evaluation and tuning
- Part 3: topics on sklearn, spark
- Part 4: project 1 - housing price prediction
- Part 5: project 2 - click-through-rate prediction
- Part 6: project 3 - p2p loan default rate
For more introductory material on models or technical discussions, you may refer to my blog: https://machinelearning100days.wordpress.com
Below is a list of resources I found very helpful for getting familiar with the topics:
- machine learning algorithms:
- Kaggle kernels and discussions
- IBM Data Science specialization on Coursera
============================================
A workaround for viewing a notebook: open the notebook you want with nbviewer online. You don't need to install anything.
- Open "https://nbviewer.jupyter.org/"
- Paste the link to your notebook (e.g. "https://github.com/ozhong/Data_Science_Projects/blob/master/project_scorecard_GiveMeSomeCredit/eda_and_scorecard.ipynb") there, and you then get "https://nbviewer.jupyter.org/github/ozhong/Data_Science_Projects/blob/master/project_scorecard_GiveMeSomeCredit/eda_and_scorecard.ipynb"
- The nbviewer site works independently of GitHub.
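
The two example links above differ only in their prefix, so the nbviewer link can also be built by hand. Below is a minimal sketch of that rewrite in Python, assuming the notebook lives on github.com and that nbviewer keeps the URL pattern shown in the example. The function name `to_nbviewer_url` is hypothetical and is not part of nbviewer or of this repository.

```python
from urllib.parse import urlparse

# Prefix taken from the example link above; assumed to be stable.
NBVIEWER_GITHUB_PREFIX = "https://nbviewer.jupyter.org/github"


def to_nbviewer_url(github_url: str) -> str:
    """Turn a github.com notebook link into the matching nbviewer link.

    Hypothetical helper: it only swaps the host part for the nbviewer
    prefix and does not check that the notebook actually exists.
    """
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com link: {github_url}")
    # parsed.path starts with "/", e.g. "/user/repo/blob/master/nb.ipynb"
    return NBVIEWER_GITHUB_PREFIX + parsed.path


if __name__ == "__main__":
    link = (
        "https://github.com/ozhong/Data_Science_Projects/blob/master/"
        "project_scorecard_GiveMeSomeCredit/eda_and_scorecard.ipynb"
    )
    print(to_nbviewer_url(link))
```

Pasting the original GitHub link into the nbviewer home page, as described in the steps above, gives the same result without any code.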