Deploying Jupyter Notebooks for Students and Researchers
https://github.com/minrk/jupyterhub-pydata-2016
Min Ragan-Kelley*, Kyle Kelley, Thomas Kluyver
PyData London, 2016
git clone https://github.com/minrk/jupyterhub-pydata-2016 /srv/jupyterhub
What is a Notebook?
• Document
• Environment
• Web app
https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
What is a Notebook Server?
What is JupyterHub?
• Manages authentication
• Spawns single-user servers on demand
• Each user gets a complete notebook server
• Initial request is handled by Hub
• User authenticates via form / OAuth
• Spawner starts single-user server
• Hub notifies Proxy
• Redirects user to /user/[name]
• Single-user Server verifies auth with Hub
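The request flow above can be sketched as a toy model. This is not JupyterHub's code: `ToyHub`, `ToyProxy`, the password table, and the port numbers are hypothetical stand-ins, chosen only to show the order in which authentication, spawning, proxy registration, and redirection happen.

```python
from dataclasses import dataclass, field


@dataclass
class ToyProxy:
    """Stand-in for the proxy: maps URL prefixes to backend addresses."""
    routes: dict = field(default_factory=dict)

    def add_route(self, prefix, target):
        self.routes[prefix] = target


@dataclass
class ToyHub:
    """Stand-in for the Hub: authenticates, spawns, and registers routes."""
    proxy: ToyProxy
    passwords: dict                      # hypothetical credential store
    servers: dict = field(default_factory=dict)
    next_port: int = 8001                # hypothetical port allocation

    def login(self, name, password):
        # 1. The user authenticates (a form here; OAuth in a real deployment).
        if self.passwords.get(name) != password:
            raise PermissionError(f"authentication failed for {name}")
        # 2. The spawner starts a single-user server on demand.
        if name not in self.servers:
            target = f"http://127.0.0.1:{self.next_port}"
            self.next_port += 1
            self.servers[name] = target
            # 3. The Hub tells the proxy where to send /user/[name].
            self.proxy.add_route(f"/user/{name}", target)
        # 4. The user is redirected to their own server.
        return f"/user/{name}"


hub = ToyHub(proxy=ToyProxy(), passwords={"minrk": "secret"})
print(hub.login("minrk", "secret"))
print(hub.proxy.routes)
```

A second login by the same user reuses the existing route rather than spawning again, which mirrors the "on demand" behaviour described earlier.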
Installation (as admin)
conda:
conda install -c conda-forge jupyterhub
conda install notebook
pip, npm:
python3 -m pip install jupyterhub
npm install -g configurable-http-proxy
test:
jupyterhub -h
configurable-http-proxy -h
Installation (this repo)
conda env create -f environment.yml
source activate jupyterhub-tutorial
Installation: Caveats
JupyterHub installation must be readable and executable by all users*
This is often not the case for conda envs, so be careful
*when using local users
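One way to check this caveat is to inspect the permission bits of the installed executable. The sketch below assumes a POSIX system; the helper name `readable_and_executable_by_all` is made up for illustration, and it only looks at the file itself, not at the directories above it.

```python
import os
import shutil
import stat


def readable_and_executable_by_all(path):
    """Return True if 'other' users have both read and execute permission."""
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


# Look up the jupyterhub executable on PATH, if it is installed.
exe = shutil.which("jupyterhub")
if exe is not None:
    print(exe, readable_and_executable_by_all(exe))
```

If the check fails for an env under a home directory, the parent directories are worth checking as well, since they must also be traversable by other users.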
Plug: conda-forge
Community-managed conda packages.
https://conda-forge.github.io
conda config --add channels conda-forge
Installation: Docker
https://docs.docker.com/engine/installation
pip install dockerspawner
docker pull jupyterhub/singleuser
JupyterHub Defaults
• Authentication: PAM (local users, passwords)
• Spawning: Local users
• Hub must run as root
Aside: SSL
• JupyterHub is an authenticated service: users log in.
That should never happen over plain HTTP.
• For testing, we can generate self-signed certificates:
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout jupyterhub.key -out jupyterhub.crt
Note: Safari will not connect websockets to untrusted (self-signed) certs
Aside: Let's Encrypt
• https://letsencrypt.org/getting-started/
• Free SSL for any domain
git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
./letsencrypt-auto certonly --standalone -d mydomain.tld
key: /etc/letsencrypt/live/mydomain.tld/privkey.pem
cert: /etc/letsencrypt/live/mydomain.tld/fullchain.pem
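The key and cert locations above follow a fixed pattern, so they can be derived from the domain name. The helper below is a hypothetical convenience, not part of JupyterHub or Let's Encrypt; the returned values are the kind of strings that `c.JupyterHub.ssl_key` and `c.JupyterHub.ssl_cert` expect.

```python
from pathlib import PurePosixPath


def letsencrypt_paths(domain, live_dir="/etc/letsencrypt/live"):
    """Build the key and certificate paths for one domain."""
    base = PurePosixPath(live_dir) / domain
    return {
        "ssl_key": str(base / "privkey.pem"),
        "ssl_cert": str(base / "fullchain.pem"),
    }


paths = letsencrypt_paths("mydomain.tld")
print(paths["ssl_key"])
print(paths["ssl_cert"])
```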
Start configuring JupyterHub
jupyterhub --generate-config
c.JupyterHub.ssl_key = 'jupyterhub.key'
c.JupyterHub.ssl_cert = 'jupyterhub.crt'
c.JupyterHub.port = 443
Installing kernels for all users
conda create -n py2 python=2 ipykernel
conda run -n py2 -- ipython kernel install
jupyter kernelspec list
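To see what `jupyter kernelspec list` is reporting, it can help to look at the kernelspec directories directly. The sketch below scans one directory for `kernel.json` files using only the standard library. The directory `/usr/local/share/jupyter/kernels` is an assumption about where a system-wide install lands on a typical Linux machine; the function name is made up for illustration.

```python
import json
from pathlib import Path


def list_kernelspecs(kernels_dir):
    """Map kernel name -> display name for each kernelspec in kernels_dir."""
    specs = {}
    root = Path(kernels_dir)
    if not root.is_dir():
        return specs
    # Each kernelspec is a subdirectory containing a kernel.json file.
    for spec_file in sorted(root.glob("*/kernel.json")):
        with spec_file.open() as f:
            spec = json.load(f)
        name = spec_file.parent.name
        specs[name] = spec.get("display_name", name)
    return specs


# Assumed system-wide location; adjust for your installation.
print(list_kernelspecs("/usr/local/share/jupyter/kernels"))
```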
Using GitHub OAuth
https://github.com/settings/applications/new
In ./env:
export GITHUB_CLIENT_ID=from_github
export GITHUB_CLIENT_SECRET=from_github
export OAUTH_CALLBACK_URL=https://YOURDOMAIN/hub/oauth_callback
source ./env
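A forgotten `source ./env` is easy to miss until login fails, so a small pre-flight check can be useful. This is only a sketch: the function name is hypothetical, and it checks nothing beyond whether the three variables from the env file are set and non-empty.

```python
import os

REQUIRED = ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET", "OAUTH_CALLBACK_URL")


def missing_oauth_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED if not environ.get(name)]


missing = missing_oauth_settings()
if missing:
    print("Missing OAuth settings:", ", ".join(missing))
```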
We need OAuthenticator:
python3 -m pip install oauthenticator
In jupyterhub_config.py:
from oauthenticator.github import LocalGitHubOAuthenticator
c.JupyterHub.authenticator_class = LocalGitHubOAuthenticator
c.LocalGitHubOAuthenticator.create_system_users = True
Specifying users
By default, any user that successfully authenticates is allowed to use the Hub.
This is appropriate for shared workstations with PAM Auth, but probably not GitHub:
# set of users allowed to use the Hub
c.Authenticator.whitelist = {'minrk', 'takluyver'}
# set of users who can administer the Hub itself
c.Authenticator.admin_users = {'minrk'}
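The effect of these two settings can be sketched as plain set membership. This is a simplified illustration of the rule described above (an empty whitelist admits everyone who authenticates), not the Authenticator's actual implementation; both function names are made up.

```python
def is_allowed(username, whitelist):
    """An empty whitelist means any authenticated user may use the Hub."""
    if not whitelist:
        return True
    return username in whitelist


def is_admin(username, admin_users):
    """Only users listed explicitly can administer the Hub."""
    return username in admin_users


whitelist = {"minrk", "takluyver"}
admin_users = {"minrk"}

for user in ("minrk", "takluyver", "someone-else"):
    print(user, is_allowed(user, whitelist), is_admin(user, admin_users))
```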
Custom Authenticators
Using DockerSpawner
We need DockerSpawner:
python3 -m pip install dockerspawner netifaces
docker pull jupyterhub/singleuser
In jupyterhub_config.py:
from oauthenticator.github import GitHubOAuthenticator
c.JupyterHub.authenticator_class = GitHubOAuthenticator
from dockerspawner import DockerSpawner
c.JupyterHub.spawner_class = DockerSpawner
# The Hub's API listens on localhost by default,
# but docker containers can't see that.
# Tell the Hub to listen on its docker network:
import netifaces
docker0 = netifaces.ifaddresses('docker0')
docker0_ipv4 = docker0[netifaces.AF_INET][0]
c.JupyterHub.hub_ip = docker0_ipv4['addr']
• There is *loads* to configure with Docker
• Networking configuration
• Data volumes
• DockerSpawner.container_image = 'jupyterhub/singleuser'
Customizing Spawners
JupyterHub with supervisor
apt-get install supervisor
# /etc/supervisor/conf.d/jupyterhub.conf
[program:jupyterhub]
command=bash launch.sh
directory=/srv/jupyterhub
autostart=true
autorestart=true
startretries=3
exitcodes=0,2
stopsignal=TERM
redirect_stderr=true
stdout_logfile=/var/log/jupyterhub.log
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
user=root
#!/usr/bin/env bash
# /srv/jupyterhub/launch.sh
set -e
source env
exec jupyterhub $@
Reference Deployments
https://github.com/jupyterhub/jupyterhub-deploy-docker
docker-compose, DockerSpawner, Hub in Docker
https://github.com/jupyterhub/jupyterhub-deploy-teaching
ansible, no docker, nbgrader
Docker Deployment
• Docker Compose: https://docs.docker.com/compose/install/
• git clone https://github.com/jupyterhub/jupyterhub-deploy-docker
• Create a network:
docker network create jupyterhub-network
• Create a volume for secrets:
docker volume create --name jupyterhub-secrets
• Create a data volume:
docker volume create --name jupyterhub-data
• mkdir secrets
• Copy SSL key, cert to:
• secrets/jupyterhub.cer (cert)
• secrets/jupyterhub.key (key)
Make userlist:
minrk admin
takluyver
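The userlist format is one username per line, with an optional `admin` marker after the name. A minimal parser for that format might look like the sketch below; the function name is hypothetical, and it assumes whitespace-separated fields as in the example above.

```python
def parse_userlist(text):
    """Return (whitelist, admins) parsed from userlist text."""
    whitelist, admins = set(), set()
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        whitelist.add(name)
        if len(parts) > 1 and parts[1] == "admin":
            admins.add(name)
    return whitelist, admins


whitelist, admins = parse_userlist("minrk admin\ntakluyver\n")
print(sorted(whitelist), sorted(admins))
```

The two sets map naturally onto `c.Authenticator.whitelist` and `c.Authenticator.admin_users`.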
Launch: 🚀
docker-compose up
Optimizations and best practices
• Always use SSL!
• Use postgres for the Hub database
• Put nginx in front of the proxy
• Run cull-idle-servers service to prune resources
• Global configuration in /etc/jupyter and /etc/ipython
• Back up your user data!!!
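The idea behind culling idle servers is simple: compare each server's last activity against a cutoff and stop the ones that are older. The sketch below shows only that selection step on in-memory data. It is not the cull-idle-servers script itself; the function name, the one-hour default, and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def idle_servers(last_activity, now, timeout=timedelta(hours=1)):
    """Return the users whose last activity is older than the timeout."""
    cutoff = now - timeout
    return sorted(name for name, seen in last_activity.items() if seen < cutoff)


now = datetime(2016, 5, 6, 12, 0, tzinfo=timezone.utc)
last_activity = {
    "minrk": now - timedelta(minutes=5),
    "takluyver": now - timedelta(hours=3),
}
print(idle_servers(last_activity, now))
```

A real service would fetch the activity timestamps from the Hub and then ask it to stop each selected server.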
When to use JupyterHub
• A class where students can do homework (nbgrader)
• A short-lived workshop, especially if installation is hard
• A research group with a shared workstation or small cluster
• On-site computing resources for researchers and analysts at an institution
When not to use JupyterHub
• JupyterHub is Authenticated and Persistent
• tmpnb: anonymous, ephemeral notebooks
• binder: tmpnb + GitHub repos
• SageMathCloud is hosted and provides real-time collaboration
API