This demo gives you an overview of training an ML model with scikit-learn, a Python library. We will run it in Jupyter, an IDE that you can use from your browser. Jupyter supports several languages, including Python.
The walk-through trains a simple model on a small set of housing data. The goal is to predict housing prices from zip code and square footage, keeping the root-mean-squared error (RMSE) of the predictions as small as possible. You will learn how to do that by:
- Visualizing the data.
- Training and evaluating an ML model.
- Improving the model's performance by correctly encoding categorical data.
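The last two steps can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: the data is synthetic, and the column names (zip, sqft, price), the zip codes, and the helper rmse are all assumptions made up for the example. It fits a linear regression twice, first treating the zip code as an ordinary number and then one-hot encoding it, and compares the RMSE on a held-out split. Plotting is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (NOT the notebook's dataset). Each hypothetical
# zip code adds a price offset that has nothing to do with its numeric value.
rng = np.random.default_rng(0)
zip_offsets = {98101: 300_000, 98102: 50_000, 98103: 180_000}
zips = rng.choice(list(zip_offsets), size=300)
sqft = rng.uniform(600, 3000, size=300)
noise = rng.normal(0, 10_000, size=300)
price = 200 * sqft + np.array([zip_offsets[z] for z in zips]) + noise
houses = pd.DataFrame({"zip": zips, "sqft": sqft, "price": price})

X = houses[["zip", "sqft"]]
y = houses["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def rmse(model):
    """Fit the model on the training split and return RMSE on the test split."""
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return float(np.sqrt(mean_squared_error(y_test, predictions)))


# Baseline: the zip code is fed to the model as if it were a quantity.
rmse_numeric = rmse(LinearRegression())

# Improvement: one-hot encode the zip code, pass sqft through unchanged.
encode_zip = ColumnTransformer(
    [("zip", OneHotEncoder(handle_unknown="ignore"), ["zip"])],
    remainder="passthrough",
)
rmse_onehot = rmse(make_pipeline(encode_zip, LinearRegression()))

print(f"RMSE, zip as a number: {rmse_numeric:,.0f}")
print(f"RMSE, zip one-hot:     {rmse_onehot:,.0f}")
```

On data shaped like this, the one-hot version should show the lower error, because a zip code is a label rather than a magnitude: a linear model cannot express "98102 is cheaper than both of its neighbours" with a single numeric coefficient.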
You can learn more about running Python code in the Jupyter IDE here. Below you can learn how to launch the Jupyter environment with the preloaded data and code you need.
The simplest way to run this sample code is to click the link below. It is served from MyBinder.org, a free, open-source software (OSS) environment associated with the OSS project called Binder.
If you have Docker installed, you can run the image with:
docker run -p 8888:8888 mobyware/anaconda
The Dockerfile is also included in the repo if you'd like to build the image from scratch, or modify and rebuild it. To build and run, use:
The "-p" switch maps a port on your machine to a port in the container. By default my image uses 8888.
Running locally is more involved but more rewarding. You can run the included notebook, or any other notebook, on your own machine by installing the Anaconda Python distribution. Instructions for your OS are here.
Once Anaconda is installed, you can run the notebook by executing the command below, then clicking the notebook called index when your browser opens:
jupyter notebook