Dokku Jupyterhub
The goal of this project is to run a dockerized jupyterhub instance on a dokku server.
Dokku creates and manages the docker network used for communication between jupyterhub and the jupyter notebooks, which are spawned as separate docker containers. Dokku performs a Dockerfile deploy.
The image for the spawned notebooks is built from images/Dockerfile after each deploy.
Data persistence is achieved by bind mounting directories from the dokku host into the notebook containers. See jupyterhub_config.py for all settings.
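As a rough illustration of the bind-mount approach, the sketch below builds the kind of host-to-container mapping that DockerSpawner accepts in its `volumes` option. The function name `build_volumes`, the per-user `users/{username}` directory and the container paths under `/home/jovyan` are assumptions made for this example; the settings this project actually uses are in jupyterhub_config.py. Because the hub starts notebook containers through the host's docker socket, the keys have to be paths on the dokku host, not paths inside the hub container.

```python
from pathlib import PurePosixPath


def build_volumes(host_root: str) -> dict:
    """Map directories on the dokku host to paths inside a notebook container.

    DockerSpawner expands "{username}" per user when it starts a container.
    """
    root = PurePosixPath(host_root)
    home = PurePosixPath("/home/jovyan")  # assumed home of the notebook user
    return {
        # hypothetical private directory, one per user
        str(root / "users" / "{username}"): str(home / "work"),
        # shared directories created on the host during setup
        str(root / "shared"): str(home / "shared"),
        str(root / "colab"): str(home / "colab"),
    }


volumes = build_volumes("/var/lib/dokku/data/storage/jupyterhub/data")
# In jupyterhub_config.py this could be assigned as:
#   c.DockerSpawner.volumes = volumes
```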
Dokku requirements
The following plugins are required and must be installed on your dokku host:
- post-deploy-script (by @lebalz), to build the user jupyterlab image after each deploy
- postgres
- letsencrypt
Create jupyterhub app folder
Repository: github.com/lebalz/dokku-jupyterhub
Create a new git project with the following files:
├── Dockerfile
├── POST_DEPLOY_SCRIPT
├── README.md
├── images
│ ├── overrides.json
│ └── Dockerfile
├── jupyterhub_config.py
└── my_azuread.py
There are two Dockerfiles: the one in the root builds the jupyterhub image, the one in the images folder builds the user jupyterlab image.
The POST_DEPLOY_SCRIPT builds the user jupyterlab image after each deploy.
To initialize a new repository with git:
git init
git add .
git commit -m "initial commit"
# add dokku remote
git remote add dokku dokku@<your-ip>:jupyterhub
./Dockerfile
# Do not forget to pin down the version
FROM jupyterhub/jupyterhub:4.0.1
# Install dependencies (for advanced authentication and spawning)
RUN pip3 install \
dockerspawner==12.1.0 \
oauthenticator==14.2.0 \
jupyterhub-idle-culler==1.2.1 \
psycopg2-binary==2.9.3
RUN pip3 install PyJWT==2.3.0
# Copy the custom authenticator
COPY my_azuread.py .
RUN mv my_azuread.py $(dirname "$(python3 -c "import oauthenticator as _; print(_.__file__)")")/my_azuread.py
# Copy the JupyterHub configuration into the container
COPY jupyterhub_config.py .
# Copy the POST_DEPLOY_SCRIPT into the container
COPY POST_DEPLOY_SCRIPT .
# Copy the notebook dockerfile into the container
COPY images ./images
./jupyterhub_config.py
./my_azuread.py
Images
The runtime image must be available on the dokku server. It is possible either to build it after each deploy (post-deploy script) or to pull it manually from a registry on the dokku host.
Option 1: Pull an existing image
- Remove or rename the POST_DEPLOY_SCRIPT in the repository, otherwise it will be used to build the image (mv POST_DEPLOY_SCRIPT _POST_DEPLOY_SCRIPT).
- Pull the preferred image and configure your jupyterhub to use it:
docker pull jupyter/scipy-notebook:latest
# and set it as the image used for spawned notebooks
dokku config:set $APP DOCKER_JUPYTER_IMAGE="jupyter/scipy-notebook:latest"
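For orientation, the hub configuration could pick up DOCKER_JUPYTER_IMAGE roughly as follows. This is a sketch and not the project's actual jupyterhub_config.py: the helper name `notebook_image` and the fallback tag are assumptions.

```python
import os


def notebook_image(env=os.environ, default="jupyter/scipy-notebook:latest"):
    """Return the image for spawned notebooks, taken from DOCKER_JUPYTER_IMAGE."""
    image = env.get("DOCKER_JUPYTER_IMAGE", "").strip()
    # fall back to the default when the variable is unset or empty
    return image or default


# In jupyterhub_config.py this could be assigned as:
#   c.DockerSpawner.image = notebook_image()
```

Reading the value from the environment keeps the image name in one place, so the same variable can drive both the image build and the spawner.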
Option 2: ./images/Dockerfile
(ensure you have the post-deploy-script plugin installed on your dokku host)
- Configure DOCKER_JUPYTER_IMAGE on your host. This name will be used to tag your built image:
dokku config:set $APP DOCKER_JUPYTER_IMAGE="jupyter/lebalz:latest"
- Add a POST_DEPLOY_SCRIPT to the root (already done here):
#!/bin/bash
# read the tag for the notebook image from the app config
TAG="$(dokku config:get "$APP" DOCKER_JUPYTER_IMAGE)"
echo "$TAG"
echo "start build"
# build the image from images/Dockerfile and tag it
(cd images && docker build . -t "$TAG")
- Set up your images/Dockerfile:
FROM jupyter/minimal-notebook:lab-4.0.2
LABEL maintainer="dev-name"
LABEL version="0.0.1"
LABEL description="Jupyter Notebook image"
USER root
# graphviz and graphviz-dev are needed for use with jupyterlab
RUN apt-get update -y && apt-get install -y graphviz graphviz-dev
USER jovyan
# all additional pip packages
RUN pip3 install --no-cache \
jupyterhub==4.0.1 \
jupyterlab==4.0.2 \
notebook==6.5.4 \
numpy==1.25.0 \
Pillow==10.0.0 \
pandas==2.0.3 \
xlrd==2.0.1 \
openpyxl==3.1.2 \
ipywidgets==8.0.7 \
ipympl==0.9.3 \
jupyterlab-spellchecker==0.8.3 \
orjson==3.9.1 \
graphviz==0.20
# add the overrides file to the jupyterlab image:
COPY overrides.json /opt/conda/share/jupyter/lab/settings/
- Make sure that all files needed to build your image are listed on your dokku host under DOKKU_POST_DEPLOY_SCRIPT_DEPENDENCIES:
dokku config:set --no-restart $APP DOKKU_POST_DEPLOY_SCRIPT_DEPENDENCIES="images/Dockerfile;images/overrides.json"
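Before pushing, it can help to confirm that every path listed in DOKKU_POST_DEPLOY_SCRIPT_DEPENDENCIES exists in the repository. The check below is a local convenience sketch and not part of the plugin; the function name `missing_dependencies` is made up for this example.

```python
from pathlib import Path


def missing_dependencies(value: str, repo_root: str = ".") -> list:
    """Return the ';'-separated paths in `value` that do not exist under repo_root."""
    root = Path(repo_root)
    # split on ';' and ignore empty entries such as a trailing separator
    paths = [p.strip() for p in value.split(";") if p.strip()]
    return [p for p in paths if not (root / p).exists()]


if __name__ == "__main__":
    deps = "images/Dockerfile;images/overrides.json"
    missing = missing_dependencies(deps)
    print("missing files:", missing if missing else "none")
```

Run from the repository root, it prints the entries that are missing, or "none" when everything listed is present.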
Dokku Setup
The commands below expect the dokku app name in the APP environment variable, e.g. APP="jupyterhub".
APP="jupyterhub"
DOMAIN="your.domain.com"
# create app
############
dokku apps:create $APP
# ensure docker networks can be used
# dokku version >= 0.26
dokku scheduler:set $APP selected docker-local
# for dokku version < 0.26
# dokku config:set $APP DOCKER_SCHEDULER=docker-local
# configure port map for accessing hub
dokku config:set $APP DOKKU_PROXY_PORT_MAP="http:80:8000"
# mount docker socket to spawn new containers
dokku storage:mount $APP /var/run/docker.sock:/var/run/docker.sock
# add a domain to it
dokku domains:add $APP $DOMAIN
# create network
dokku network:create $APP
dokku network:set $APP bind-all-interfaces true
# attach the network to the app
dokku network:set $APP attach-post-create $APP
# configure env variables for the network
dokku config:set $APP DOCKER_NETWORK_NAME=$APP
dokku config:set $APP HUB_IP=$APP.web
# create postgres service
dokku postgres:create $APP
dokku postgres:link $APP $APP
## The URI should start with postgresql:// instead of postgres://.
# SQLAlchemy used to accept both, but has removed support for the
# postgres name.
DB_URL=$(dokku config:get $APP DATABASE_URL)
dokku config:set --no-restart $APP DATABASE_URL="${DB_URL//postgres:\/\//postgresql:\/\/}"
# or, optionally, edit the env file directly
# nano /home/dokku/$APP/ENV
# configure post deploy script
# all files needed for the image build must be configured here
# ";" separated list of files (relative to the root of the app)
# - images/Dockerfile /* the dockerfile for the image build */
# - images/overrides.json /* configure jupyterlab settings */
# - ...
dokku config:set --no-restart $APP DOKKU_POST_DEPLOY_SCRIPT_DEPENDENCIES="images/Dockerfile;images/overrides.json"
# STORAGE AND DATA PERSISTENCE
##############################
mkdir -p /var/lib/dokku/data/storage/$APP/data
dokku storage:mount $APP /var/lib/dokku/data/storage/$APP/data:/data
## create shared directories
mkdir -p /var/lib/dokku/data/storage/$APP/data/shared
mkdir -p /var/lib/dokku/data/storage/$APP/data/colab
## grant user jovyan:users (uid 1000, gid 100) access to the shared mounted volumes
chown -R 1000:100 /var/lib/dokku/data/storage/$APP/data/shared
chown -R 1000:100 /var/lib/dokku/data/storage/$APP/data/colab
# increase max body upload size
dokku nginx:set $APP client-max-body-size 30m
# AUTHENTICATORS - OAUTH
########################
### edit your credentials: `nano /home/dokku/$APP/ENV`
# or use the dokku config:set command as shown below
## GITHUB oauth config
# dokku config:set $APP OAUTH_CALLBACK_URL="https://$DOMAIN/hub/oauth_callback"
# dokku config:set $APP GITHUB_CLIENT_ID="XXXXXXXXXXXXXX"
# dokku config:set $APP GITHUB_CLIENT_SECRET="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
## AZURE AD oauth config
# dokku config:set $APP AAD_TENANT_ID="xxxxxx-xxxxxx-xxxxxxx"
# dokku config:set $APP AAD_OAUTH_CALLBACK_URL="https://$DOMAIN/hub/oauth_callback"
# dokku config:set $APP AAD_CLIENT_ID="xxxxxx-xxxxxx-xxxxxxx"
# dokku config:set $APP AAD_CLIENT_SECRET="xxxxxx-xxxxxx-xxxxxxx"
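The environment variables set above are meant to be read by jupyterhub_config.py. As a hedged sketch of that hand-off, the following collects the network name and hub address and applies the same postgres:// to postgresql:// rewrite in Python. The function names `normalize_db_url` and `hub_settings` and the returned keys are illustrative, and the project's real configuration may differ.

```python
import os


def normalize_db_url(url: str) -> str:
    """Rewrite the legacy postgres:// scheme to postgresql:// for SQLAlchemy."""
    prefix = "postgres://"
    if url.startswith(prefix):
        return "postgresql://" + url[len(prefix):]
    return url


def hub_settings(env=os.environ) -> dict:
    """Collect the values a jupyterhub_config.py would need from the environment."""
    return {
        "network_name": env.get("DOCKER_NETWORK_NAME", ""),
        "hub_connect_ip": env.get("HUB_IP", ""),
        "db_url": normalize_db_url(env.get("DATABASE_URL", "")),
    }


# In jupyterhub_config.py these values could be assigned as (assumed mapping):
#   settings = hub_settings()
#   c.DockerSpawner.network_name = settings["network_name"]
#   c.JupyterHub.hub_connect_ip = settings["hub_connect_ip"]
#   c.JupyterHub.db_url = settings["db_url"]
```

Normalizing the URL inside the config as well makes the hub tolerant of a DATABASE_URL that was relinked and reset to the postgres:// form.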
Letsencrypt
After the initial deploy, you can enable letsencrypt for your domain.
Make sure:
- you have set a domain and your page is reachable
- no page rules with permanent redirects (e.g. from Cloudflare) exist
MAIL="your@email.address"
dokku config:set --no-restart $APP DOKKU_LETSENCRYPT_EMAIL=$MAIL
dokku letsencrypt $APP
# on newer versions of the letsencrypt plugin the command is:
# dokku letsencrypt:enable $APP
Jupyterlab Settings
Edit images/overrides.json to change the default jupyterlab settings.
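JupyterLab reads overrides.json as a mapping from a settings plugin id to the values that should replace that plugin's defaults. The sketch below writes such a file with Python. The helper name `write_overrides` is hypothetical, and the chosen setting (a dark default theme) is only an example, not taken from this project's overrides.json.

```python
import json
from pathlib import Path


def write_overrides(path: str, overrides: dict) -> None:
    """Write JupyterLab default-setting overrides as pretty-printed JSON."""
    Path(path).write_text(json.dumps(overrides, indent=2) + "\n", encoding="utf-8")


# example only: make the dark theme the default for new users
example_overrides = {
    "@jupyterlab/apputils-extension:themes": {"theme": "JupyterLab Dark"},
}

if __name__ == "__main__":
    # writes to the current directory; copy the result to images/overrides.json
    write_overrides("overrides.json", example_overrides)
```

After changing the file, redeploy so that the notebook image is rebuilt with the new settings.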