This repository contains the homework and other course material for each of the lessons during the summer school.
Homework is due at 12:00 on the Monday before the next lesson. Hand it in by sending the completed notebooks directly to one of the teachers via the IBS Academy Slack.
Make sure you save the notebook with its output. Do not clear the output before saving.
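Before sending a notebook, you can check whether its output was saved. The sketch below assumes the standard `.ipynb` JSON layout (a top-level `cells` list in which code cells carry an `outputs` list); the function names are illustrative and not part of the course material. Some cells, such as imports or plain assignments, legitimately produce no output, so treat the result as a hint rather than an error.

```python
import json
from pathlib import Path


def code_cells_without_output(notebook):
    """Return the indices of non-empty code cells that have no saved output."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        # "source" may be a single string or a list of lines
        has_source = bool("".join(cell.get("source", "")).strip())
        if has_source and not cell.get("outputs"):
            missing.append(index)
    return missing


def check_notebook_file(path):
    """Load an .ipynb file from disk and run the check on it."""
    notebook = json.loads(Path(path).read_text(encoding="utf-8"))
    return code_cells_without_output(notebook)
```

For example, `check_notebook_file("homework.ipynb")` (a hypothetical file name) returns an empty list when every non-empty code cell has output.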
For beginners, we recommend using both Anaconda Navigator and JupyterLab.
Follow the steps for Anaconda Navigator under installation. You can then install JupyterLab using the Anaconda Navigator UI.
If a local installation is not working for you, we recommend using Google Colab. You can load this repo in Colab by following the menu:
- File -> Open Notebook -> GitHub (https://github.com/sijmenw/IBS-python-for-data-science).
- Select the notebook you would like to work on
- Download the data file using `!wget [url]`, e.g. for the pokemon.csv set: `!wget https://github.com/sijmenw/IBS-python-for-data-science/raw/master/homework/week1/pokemon.csv`
- Work on the notebook and download the notebook when you're done
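If you work in a local installation where `wget` is not available (an assumption about your setup), the same download can be sketched with the Python standard library. The helper names `filename_from_url` and `download` are illustrative; the URL is the one from the list above.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL of the week 1 dataset, as given in the wget example
POKEMON_URL = (
    "https://github.com/sijmenw/IBS-python-for-data-science"
    "/raw/master/homework/week1/pokemon.csv"
)


def filename_from_url(url):
    """Return the last path component of a URL, e.g. 'pokemon.csv'."""
    return PurePosixPath(urlparse(url).path).name


def download(url, dest_dir="."):
    """Download the file at url into dest_dir and return the local path."""
    target = Path(dest_dir) / filename_from_url(url)
    urlretrieve(url, str(target))
    return target
```

Calling `download(POKEMON_URL)` in a notebook cell should place `pokemon.csv` next to the notebook, after which it can be opened like any other local file.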
If you're familiar with the command line, feel free to use conda.
If you're running into trouble, let us know in Slack!
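When you report a problem, it helps to include your Python version and the versions of the installed packages. The snippet below is a minimal sketch that collects them; the package list is an assumption and may not match exactly what the lessons use.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Assumed package list; adjust it to what the lessons actually use.
COURSE_PACKAGES = ["jupyterlab", "numpy", "pandas", "matplotlib", "scikit-learn"]


def environment_report(packages=COURSE_PACKAGES):
    """Return a dict with the Python version, OS, and package versions."""
    report = {"python": platform.python_version(), "os": platform.platform()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


# Print one line per entry so the result is easy to paste into Slack
for name, value in environment_report().items():
    print(f"{name}: {value}")
```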
Work through each of the lessons provided here.
Links:
You can schedule the exam with the provided link (see Slack channel) when you've completed the homework.
The exam will be held in the NewRow environment you are familiar with from the classes.
The exam will consist of three parts:
- Presentation
- Live coding
- Questions
For the presentation, you will receive a dataset that you need to work on using a Jupyter notebook. In your work, touch on each of the following topics:
- EDA
  - Exploratory data analysis: show some data aggregations and visualisations
- Data engineering
  - Fix data types, NaN values, etc.
- Feature engineering
  - Create a new feature (column) to enrich the data
- Machine learning
  - Train an algorithm to make a prediction based on the data
- Evaluation
- Discussion
You are expected to cover each of these only briefly; no in-depth work is required. A suggestion for the first four steps is provided.
We encourage you to use the notebook you worked on for the presentation. If you do, use Markdown cells to divide the notebook into sections.
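To give an idea of the expected depth, the topics listed above could be covered roughly as follows. This is only a sketch on made-up data, assuming pandas, NumPy, and scikit-learn are installed; the columns, the `overweight` target, and the choice of a decision tree are all hypothetical and are not the exam dataset or the provided suggestion.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the exam dataset
rng = np.random.default_rng(0)
n = 200
height = rng.normal(170, 10, n)
weight = rng.normal(70, 12, n)
df = pd.DataFrame({
    "height_cm": height.round(1).astype(str),  # stored as text on purpose
    "weight_kg": weight,
    "overweight": (weight / (height / 100) ** 2 > 25).astype(int),
})
df.loc[df.index % 25 == 0, "weight_kg"] = np.nan  # inject missing values

# EDA: summary statistics and a simple aggregation
print(df.describe(include="all"))
print(df.groupby("overweight")["weight_kg"].mean())
# In a notebook you could also plot, e.g. df["weight_kg"].plot.hist()

# Data engineering: fix the data type and fill the NaN values
df["height_cm"] = pd.to_numeric(df["height_cm"], errors="coerce")
df["weight_kg"] = df["weight_kg"].fillna(df["weight_kg"].median())

# Feature engineering: derive a new column from existing ones
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

# Machine learning: train a small classifier on a train/test split
X = df[["height_cm", "weight_kg", "bmi"]]
y = df["overweight"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Evaluation: accuracy on the held-out test set
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

In the discussion part you would then comment on results like this accuracy, for instance how the filled-in missing values or the choice of model might have influenced it.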
The Slack channel is open for questions and further clarification.
Good luck!