Cloud Example Jupyter: Difference between revisions

Carnold (talk | contribs)
Revision as of 12:31, 16 August 2019

Work in progress

Introduction

The goal of this project is to support a class teaching basic programming using Jupyter Notebook. A Jupyter Notebook server is a single process that supports only one person, so supporting a whole class means running one notebook process per student. Jupyter offers JupyterHub, which automatically spawns these single-user processes. However, a single machine can support only around 50-70 students before performance suffers. Kubernetes allows this setup to scale out horizontally by spreading the single-user instances across physical nodes. I will give a breakdown of the different pieces needed to make this work and go into more detail on certain aspects.
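
To make the scaling argument concrete, here is a rough sizing sketch in Python. It only applies the 50-70 students-per-machine figure quoted above; the class size of 200 and the function name `nodes_needed` are assumptions made for illustration, not measurements from this project.

```python
import math


def nodes_needed(students: int, students_per_node: int) -> int:
    """Return the smallest number of nodes that can host all students."""
    if students < 0 or students_per_node <= 0:
        raise ValueError("students must be >= 0 and students_per_node > 0")
    return math.ceil(students / students_per_node)


# Hypothetical class of 200 students, checked at both ends of the 50-70 range.
for capacity in (50, 70):
    print(f"{capacity} students per node -> {nodes_needed(200, capacity)} nodes")
```

Real capacity depends on what the notebooks actually run, so the result is best treated as a starting estimate.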

Basics

Here is what the Workloads tab looks like (screenshot: Jupyter1.JPG):

Details of each deployment:

continuous-image-puller

This deployment:

  • Runs a copy of itself on every physical node of the cluster. Kubernetes calls this a DaemonSet.
  • Pulls the single-user Docker image onto each node ahead of time, and automatically onto any new node added to the cluster.
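
The two points above can be sketched as a Kubernetes manifest built in Python. This is a minimal illustration of the DaemonSet pattern, not the manifest this cluster actually runs. The image name `example.org/single-user:latest`, the label scheme, and the use of a `pause` container are all assumptions. The idea is that the init container uses the single-user image, so each node has to pull it before the pod can start, and a tiny long-running container then keeps the pod in place.

```python
import json


def image_puller_daemonset(name: str, image: str) -> dict:
    """Build a DaemonSet manifest that pre-pulls `image` on every node."""
    labels = {"app": name}  # hypothetical label scheme
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",  # one pod per node, including nodes added later
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Starting this init container forces the node to pull the image.
                    "initContainers": [
                        {
                            "name": "pull-single-user-image",
                            "image": image,
                            "command": ["/bin/sh", "-c", "echo pulled"],
                        }
                    ],
                    # A small long-running container keeps the pod alive afterwards.
                    "containers": [
                        {"name": "pause", "image": "registry.k8s.io/pause:3.9"}
                    ],
                },
            },
        },
    }


# Placeholder image name; substitute the real single-user image.
manifest = image_puller_daemonset(
    "continuous-image-puller", "example.org/single-user:latest"
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output could be saved to a file and applied to a test cluster to observe the behaviour.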