Turning Old Server into AI Master - Part 2 (The Software)

If you haven't read Part One, you can here - https://www.pdavies.io/blog/turning-old-server-into-ai-master/

I have taken some old Dell servers and am turning them into an AI beast to run at home and play around with. Part one dealt with the hardware and the issues I had building the server; this post covers the software setup. I will not tell you how I installed Ubuntu; enough sites cover that these days.

Please note that I am not an expert at AI/ML, and this has been a learning experience for me, so what I do below may not be the best way of doing things. It is just what I have pieced together so far from ChatGPT/Claude.ai and different blogs, docs and sites.

Nvidia Software

I installed an Nvidia Tesla P100 GPU into this server, so I first installed the drivers and CUDA software.

Go to the CUDA download page and select the options that suit your system.

The first step, apt-get -y install cuda-toolkit-12-8, took a long time, so beware!

Need to get 3,726 MB of archives.
After this operation, 8,869 MB of additional disk space will be used.

I have some really simple Ansible code for doing this. Before anyone says anything: yes, it is hardcoded to specific versions, and no, if I were doing this for a production system I wouldn't have hardcoded them!

---
- name: Download NVIDIA CUDA keyring package
  get_url:
    url: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
    dest: /tmp/cuda-keyring_1.1-1_all.deb
    mode: '0644'
  register: keyring_download

- name: Install NVIDIA CUDA keyring package
  apt:
    deb: /tmp/cuda-keyring_1.1-1_all.deb
    state: present
  become: yes
  when: keyring_download.changed

- name: Update apt cache
  apt:
    update_cache: yes
  become: yes

- name: Install CUDA toolkit
  apt:
    name: cuda-toolkit-12-8
    state: present
  become: yes

- name: Install CUDA drivers
  apt:
    name: cuda-drivers
    state: present
  become: yes

- name: Clean up downloaded packages
  file:
    path: /tmp/cuda-keyring_1.1-1_all.deb
    state: absent
  become: yes

IMPORTANT: Restart the server after installation is complete!

If you don't restart and try to check out the GPU status using nvidia-smi you will get something like this:

But if you have restarted, then you'll get all the details:
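The same check can be scripted, which is handy after an unattended Ansible run. This is a minimal sketch, not part of my role: the helper names (parse_gpu_csv, query_gpus) are made up for illustration, and it assumes nvidia-smi is on the PATH once the drivers are installed. It uses nvidia-smi's --query-gpu and --format=csv,noheader options.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV row
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(output):
    """Turn nvidia-smi CSV rows into a list of dicts, one per GPU."""
    gpus = []
    for line in output.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) == len(QUERY_FIELDS):
            gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpus():
    """Return GPU details, or None if nvidia-smi is missing or exits with an error."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS), "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Typically what happens when the driver is installed but not loaded yet
        return None
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if gpus is None:
        print("No usable NVIDIA driver found; has the server been restarted?")
    else:
        for gpu in gpus:
            print(gpu)
```

A None result is the scripted equivalent of the failing nvidia-smi call before the restart.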

Docker/Portainer

I installed Docker as I plan on running a few things in Docker to make it easier to manage; again, I have Ansible do this for me, and you can find loads of examples, so I won't get into that!

Portainer is just a management UI for Docker to make it easier again. I already have a Portainer server running, so I just added the Portainer agent for this.

Visual Studio Code Server (Coder)

My wife may also want to play around with this thing, so I will install Coder. It is just a Visual Studio Code server with multi-user access, versus code-server, which is single-user!

Hours go past, and..... I find....

Coder is not just for VS Code servers; it's an awesome development platform tool, but I won't go into that today. Also, I found that I didn't need it for what I was planning; I needed a Jupyter notebook server, and then I could connect my local VS Code to Jupyter to run my code.

So, guess what is next 😄

JupyterHub

JupyterHub vs. Jupyter Notebook

These are related but distinct tools in the Jupyter ecosystem that serve different purposes:

Jupyter Notebook

Jupyter Notebook is a single-user application that allows you to create and share documents containing:

Live code
Visualizations
Narrative text
Equations

Key characteristics:

Runs locally on your computer
Serves one user at a time
Primary interface is a web browser connecting to a local server
Files are saved as .ipynb notebooks
Ideal for individual data analysis, experimentation, and sharing results

JupyterHub

JupyterHub is a multi-user server that manages and proxies multiple instances of the single-user Jupyter Notebook server.

Key characteristics:

Designed for teams, classes, or organizations
Centrally deployed on a server
Manages authentication and user sessions
Spawns individual notebook servers for each user
Can handle resources, permissions, and quotas
Supports various authentication systems (like GitHub OAuth, LDAP, etc.)
Requires more administration and setup

So now that we know the difference: I went with JupyterHub because I want multi-user. So let's look at the Ansible role I built. I am sure there are already roles out there to deploy this, but I wanted to build my own to get a true understanding of how it works.

Let's start with the templates:

I want this to run as a systemd service, so I have created a service unit file:

[Unit]
Description=JupyterHub
After=network.target

[Service]
User=root
# Source the GitHub OAuth environment variables if enabled
{% if jupyterhub_github_oauth_enabled | default(false) %}
EnvironmentFile={{ jupyterhub_folder }}/github_oauth.env
{% endif %}
Environment="PATH={{ jupyterhub_folder }}/venv/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart={{ jupyterhub_folder }}/venv/bin/jupyterhub -f {{ jupyterhub_folder }}/jupyterhub_config.py
WorkingDirectory={{ jupyterhub_folder }}
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Next is the config file, and this is the real guts of it all. I will show the final rendered config file and explain it below, and if you would like the Ansible template, I have added that further down:

/opt/jupyterhub/jupyterhub_config.py

import os
import pwd
import grp
import shutil

from oauthenticator.github import LocalGitHubOAuthenticator

def create_user_dir_hook(spawner):
    """Create user directory with correct permissions before spawning the notebook server."""
    # Get username
    username = spawner.user.name

    # Define the directory path
    notebook_dir = f'/mnt/storage/jupyterhub/notebooks/{username}'

    # Create directory if it doesn't exist
    if not os.path.exists(notebook_dir):
        os.makedirs(notebook_dir, exist_ok=True)

    # Get user ID and group ID
    try:
        uid = pwd.getpwnam(username).pw_uid
        gid = pwd.getpwnam(username).pw_gid
    except KeyError:
        # No local system account exists for this user (this might happen with
        # GitHub OAuth where system users aren't created), so skip the
        # ownership and permission changes
        return

    # Set ownership
    os.chown(notebook_dir, uid, gid)

    # Set permissions (rwx for user, r-x for group and others)
    os.chmod(notebook_dir, 0o755)

    # Optionally, you could add starter notebooks
    # shutil.copy('/path/to/welcome.ipynb', os.path.join(notebook_dir, 'welcome.ipynb'))
    # os.chown(os.path.join(notebook_dir, 'welcome.ipynb'), uid, gid)


c = get_config()

# Basic JupyterHub settings
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8000
c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.bind_url = 'https://hub.pdavies.io'

# GitHub OAuth settings
c.JupyterHub.authenticator_class = 'oauthenticator.github.LocalGitHubOAuthenticator'
c.LocalGitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.LocalGitHubOAuthenticator.client_id = os.environ['GITHUB_CLIENT_ID']
c.LocalGitHubOAuthenticator.client_secret = os.environ['GITHUB_CLIENT_SECRET']

c.LocalGitHubOAuthenticator.allowed_organizations = ['TechSphereAu']

# Admin users
c.Authenticator.admin_users = set(['wonderphil'])

# Spawner configuration
c.JupyterHub.spawner_class = 'systemdspawner.SystemdSpawner'
c.SystemdSpawner.default_shell = '/bin/bash'
c.SystemdSpawner.cmd = ['/mnt/storage/opt/jupyterhub/venv/bin/jupyterhub-singleuser']
c.Spawner.notebook_dir = '/mnt/storage/jupyterhub/notebooks/{username}'
c.Spawner.default_url = '/lab'
c.Spawner.pre_spawn_hook = create_user_dir_hook

# Create system users if using local spawner
c.LocalAuthenticator.create_system_users = True

# Security
c.JupyterHub.cookie_secret_file = '/mnt/storage/opt/jupyterhub/jupyterhub_cookie_secret'

# Specify overrides from variables
c.SystemdSpawner.isolate_tmp = True
c.SystemdSpawner.isolate_devices = True
c.Spawner.http_timeout = 300
c.Spawner.debug = True
c.JupyterHub.debug_db = False
c.JupyterHub.log_level = 'DEBUG'
c.SystemdSpawner.use_sudo = False

First we have a "pre-spawn hook". The reason for this is that I am storing all the notebook files on a different drive, and instead of using symlinks or changing the system's home directory location, I have a simple hook that creates the folder and sets the correct ownership and permissions.

Next is the basic Hub config, which allows nginx to act as a reverse proxy for it:

# Basic JupyterHub settings
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8000
c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.bind_url = 'https://hub.pdavies.io'

Next are the OAuth settings. So that I don't need to manage users, I am using our GitHub org; this section also sets who is an admin:

# GitHub OAuth settings
c.JupyterHub.authenticator_class = 'oauthenticator.github.LocalGitHubOAuthenticator'
c.LocalGitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.LocalGitHubOAuthenticator.client_id = os.environ['GITHUB_CLIENT_ID']
c.LocalGitHubOAuthenticator.client_secret = os.environ['GITHUB_CLIENT_SECRET']

c.LocalGitHubOAuthenticator.allowed_organizations = ['TechSphereAu']

# Admin users
c.Authenticator.admin_users = set(['wonderphil'])

Next comes the spawner, and this was the important part to get right: when it's wrong you get all sorts of errors and your hub server won't spawn any servers for you.

# Spawner configuration
c.JupyterHub.spawner_class = 'systemdspawner.SystemdSpawner'
c.SystemdSpawner.default_shell = '/bin/bash'
c.SystemdSpawner.cmd = ['/mnt/storage/opt/jupyterhub/venv/bin/jupyterhub-singleuser']
c.Spawner.notebook_dir = '/mnt/storage/jupyterhub/notebooks/{username}'
c.Spawner.default_url = '/lab'
c.Spawner.pre_spawn_hook = create_user_dir_hook

# Create system users if using local spawner
c.LocalAuthenticator.create_system_users = True

c.SystemdSpawner.isolate_tmp = True
c.SystemdSpawner.isolate_devices = True
c.Spawner.http_timeout = 300
c.Spawner.debug = True
c.SystemdSpawner.use_sudo = False

This basically allows me to spin up a separate notebook server for every user. Because I have deployed into a Python virtual environment, I needed to give the path to the jupyterhub-singleuser executable. I also tell it where to create a "home/notebook" dir for each user, tell it about the pre-spawn hook, and then set some other options to keep users isolated from each other.

This probably took the longest to get right, and while troubleshooting, I figured out a few things that are helpful:

> systemctl list-units | grep jupyter
jupyter-wonderphil-singleuser.service                                                                                 loaded active running   /mnt/storage/opt/jupyterhub/venv/bin/jupyterhub-singleuser
  jupyterhub.service                                                                                                    loaded active running   JupyterHub

Once you find the name of the service, in this case jupyter-wonderphil-singleuser.service, you can run the following to see the issues with it:

systemctl status jupyter-wonderphil-singleuser.service 
# This was the most useful

# second was journalctl, but I found this hard to keep track of
journalctl -xeu jupyter-wonderphil-singleuser.service

The biggest problem I had was the local user wonderphil not having access to the file system. So if you have an issue, make sure you sudo -u <user>, try to access the files and folders, run the spawner command, and see what happens.
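When chasing this kind of permission problem, it can help to see the owner and mode of every directory on the way down to the notebook folder, since one locked-down parent is enough to block a user. The following is a rough sketch of that idea, not something from my role; describe_path_chain is a hypothetical helper name and it only uses the Python standard library on Linux.

```python
import os
import pwd
import stat
import sys


def describe_path_chain(path):
    """Return (path, owner, mode string) for the path and each parent, root first."""
    path = os.path.abspath(path)
    chain = []
    while True:
        chain.append(path)
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent

    rows = []
    for entry in reversed(chain):
        info = os.stat(entry)
        try:
            owner = pwd.getpwuid(info.st_uid).pw_name
        except KeyError:
            # The uid has no passwd entry, so fall back to the number
            owner = str(info.st_uid)
        rows.append((entry, owner, stat.filemode(info.st_mode)))
    return rows


if __name__ == "__main__":
    # Pass the directory to inspect, e.g. a user's notebook folder
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    for entry, owner, mode in describe_path_chain(target):
        print(f"{mode}  {owner:<12}  {entry}")
```

Any directory in the output that lacks the execute bit for the user in question (or for "others") is a likely culprit.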

OK, back to Ansible. The second-last template houses the GitHub creds. I put them in a separate file so there is less chance of others seeing them; I know it's not the greatest, but it's better than nothing at this point.

GITHUB_CLIENT_ID={{ github_client_id }}
GITHUB_CLIENT_SECRET={{ github_client_secret }}
OAUTH_CALLBACK_URL={{ github_callback_url }}
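Because the config file reads these values with os.environ[...], a missing variable shows up as a KeyError when JupyterHub starts. A small check like the one below could make that failure easier to read. This is only a sketch under my own assumptions: missing_oauth_vars is a made-up helper, and the variable names are the three from the env file above.

```python
import os

# The three variables written by the env file template
REQUIRED_VARS = ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET", "OAUTH_CALLBACK_URL")


def missing_oauth_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_oauth_vars()
    if missing:
        print("Missing OAuth settings: " + ", ".join(missing))
    else:
        print("All OAuth settings are present")
```

Run it in the same environment as the service (for example with the env file sourced) to see which values are not making it through.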

The last template is for Nginx. I have a main Nginx server that routes all my traffic, so I just had a basic Nginx config to deploy:

upstream hub-{{ deployment_environment }} {
  server {{ main_server }}:{{ host_port }};
}

server {
  listen 80;
  server_name {{ hub_domain }};
  return 301 https://$host$request_uri;
}


server {
  listen 443 ssl;
  server_name {{ hub_domain }};

  access_log  /var/log/nginx/access-{{ deployment_environment }}-hub.log json_log;
  error_log  /var/log/nginx/error-{{ deployment_environment }}-hub.log;

  ssl_certificate /etc/letsencrypt/live/{{ certificate_name }}/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/{{ certificate_name }}/privkey.pem;

  ssl_protocols TLSv1.2 TLSv1.3;
  ssl_prefer_server_ciphers on;

  ssl_dhparam /etc/ssl/dhparams.pem; # openssl dhparam -out /etc/ssl/dhparams.pem 4096
  ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
  ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
  ssl_session_timeout  10m;
  ssl_session_cache shared:SSL:10m;
  ssl_session_tickets off; # Requires nginx >= 1.5.9
  ssl_stapling on; # Requires nginx >= 1.3.7
  ssl_stapling_verify on; # Requires nginx >= 1.3.7
  resolver 8.8.8.8 valid=300s;
  resolver_timeout 5s;

  # Security headers
  add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
  add_header X-Content-Type-Options nosniff;
  add_header X-XSS-Protection "1; mode=block";

  # Proxy headers
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header X-Forwarded-Proto $scheme;

  # WebSocket support
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "upgrade";

  # Proxy timeouts
  proxy_read_timeout 86400;
  proxy_send_timeout 86400;

  # JupyterHub proxy
  location / {
    proxy_pass http://hub-{{ deployment_environment }};

    # Important for JupyterHub to handle redirects properly
    proxy_redirect http://hub-{{ deployment_environment }}/ $scheme://$host/;
  }

  # Static files caching (optional)
  location ~* \.(?:jpg|jpeg|gif|png|ico|svg|css|js)$ {
    proxy_pass http://hub-{{ deployment_environment }};
    proxy_cache_valid 30m;
    expires 30m;
    add_header Cache-Control "public";
  }
}

OK, after that come the Ansible tasks to deploy everything. They are pretty self-explanatory, so I will share them but won't go into detail:

---
- name: Install required packages
  apt:
    name:
      - python3
      - python3-pip
      - python3-dev
      - python3-venv
      - nodejs
      - npm
      - git
    state: present
  become: true

- name: Install configurable-http-proxy
  npm:
    name: configurable-http-proxy
    global: true
  become: true

- name: Create JupyterHub directory
  file:
    path: "{{ jupyterhub_folder }}"
    state: directory
    owner: root
    group: root
    mode: '0755'
  become: true

- name: Create virtual environment
  command: "python3 -m venv {{ jupyterhub_folder }}/venv"
  args:
    creates: "{{ jupyterhub_folder }}/venv"
  become: true

- name: Install JupyterHub and dependencies in virtual environment
  pip:
    name:
      - jupyterhub
      - jupyterlab
      - notebook
      - oauthenticator  # For GitHub authentication
      - jupyterhub-systemdspawner
    state: present
    virtualenv: "{{ jupyterhub_folder }}/venv"
    virtualenv_command: python3 -m venv
  become: true

- name: Create JupyterHub config file
  template:
    src: jupyterhub_config.py.j2
    dest: "{{ jupyterhub_folder }}/jupyterhub_config.py"
    owner: root
    group: root
    mode: '0644'
  become: true
  notify: Restart JupyterHub

- name: Create GitHub OAuth environment file
  template:
    src: github_oauth.env.j2
    dest: "{{ jupyterhub_folder }}/github_oauth.env"
    owner: root
    group: root
    mode: '0600'
  become: true
  when: jupyterhub_github_oauth_enabled | default(false)

- name: Create JupyterHub systemd service
  template:
    src: jupyterhub.service.j2
    dest: /etc/systemd/system/jupyterhub.service
    owner: root
    group: root
    mode: '0644'
  become: true
  notify: Restart JupyterHub

- name: Enable and start JupyterHub service
  systemd:
    name: jupyterhub
    state: started
    enabled: true
    daemon_reload: true
  become: true

For completeness here is the template for the config file as well:

import os
import pwd
import grp
import shutil

{% if jupyterhub_github_oauth_enabled | default(false) %}
from oauthenticator.github import LocalGitHubOAuthenticator
{% endif %}

def create_user_dir_hook(spawner):
    """Create user directory with correct permissions before spawning the notebook server."""
    # Get username
    username = spawner.user.name

    # Define the directory path
    notebook_dir = f'/mnt/storage/jupyterhub/notebooks/{username}'

    # Create directory if it doesn't exist
    if not os.path.exists(notebook_dir):
        os.makedirs(notebook_dir, exist_ok=True)

    # Get user ID and group ID
    try:
        uid = pwd.getpwnam(username).pw_uid
        gid = pwd.getpwnam(username).pw_gid
    except KeyError:
        # No local system account exists for this user (this might happen with
        # GitHub OAuth where system users aren't created), so skip the
        # ownership and permission changes
        return

    # Set ownership
    os.chown(notebook_dir, uid, gid)

    # Set permissions (rwx for user, r-x for group and others)
    os.chmod(notebook_dir, 0o755)

    # Optionally, you could add starter notebooks
    # shutil.copy('/path/to/welcome.ipynb', os.path.join(notebook_dir, 'welcome.ipynb'))
    # os.chown(os.path.join(notebook_dir, 'welcome.ipynb'), uid, gid)


c = get_config()

# Basic JupyterHub settings
c.JupyterHub.ip = '{{ jupyterhub_ip | default("0.0.0.0") }}'
c.JupyterHub.port = {{ jupyterhub_port | default("8000") }}
c.JupyterHub.hub_ip = '{{ jupyterhub_ip | default("0.0.0.0") }}'
c.JupyterHub.bind_url = '{{ jupyterhub_bind_url | default("http://:8000") }}'




{% if jupyterhub_github_oauth_enabled | default(false) %}
# GitHub OAuth settings
c.JupyterHub.authenticator_class = 'oauthenticator.github.LocalGitHubOAuthenticator'
c.LocalGitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.LocalGitHubOAuthenticator.client_id = os.environ['GITHUB_CLIENT_ID']
c.LocalGitHubOAuthenticator.client_secret = os.environ['GITHUB_CLIENT_SECRET']


{% if jupyterhub_github_allowed_users | default([]) %}
c.LocalGitHubOAuthenticator.allowed_users = set([{% for user in jupyterhub_github_allowed_users %}'{{ user }}'{% if not loop.last %}, {% endif %}{% endfor %}])
{% endif %}

{% if jupyterhub_github_allowed_organizations | default([]) %}
c.LocalGitHubOAuthenticator.allowed_organizations = [{% for org in jupyterhub_github_allowed_organizations %}'{{ org }}'{% if not loop.last %}, {% endif %}{% endfor %}]
{% endif %}

{% if jupyterhub_github_allowed_teams | default([]) %}
c.LocalGitHubOAuthenticator.allowed_github_teams = [{% for team in jupyterhub_github_allowed_teams %}'{{ team }}'{% if not loop.last %}, {% endif %}{% endfor %}]
{% endif %}

{% else %}
# Default PAM authentication
c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'
{% endif %}

# Admin users
c.Authenticator.admin_users = set([{% for admin in jupyterhub_admins | default([]) %}'{{ admin }}'{% if not loop.last %}, {% endif %}{% endfor %}])

# Spawner configuration
c.JupyterHub.spawner_class = '{{ jupyterhub_spawner_class | default("jupyterhub.spawner.LocalProcessSpawner") }}'
c.SystemdSpawner.default_shell = '{{ jupyterhub_default_shell | default("/bin/bash") }}'
c.SystemdSpawner.cmd = {{ jupyterhub_spawner_cmd | default("None") }}
c.Spawner.notebook_dir = '{{ jupyterhub_notebook_dir | default("~/notebooks") }}'
c.Spawner.default_url = '{{ jupyterhub_default_url | default("/lab") }}'
c.Spawner.pre_spawn_hook = create_user_dir_hook

# Create system users if using local spawner
{% if jupyterhub_create_system_users | default(true) %}
c.LocalAuthenticator.create_system_users = True
{% endif %}

# Security
c.JupyterHub.cookie_secret_file = '{{ jupyterhub_cookie_secret_file }}'
{% if jupyterhub_ssl_key is defined and jupyterhub_ssl_cert is defined %}
c.JupyterHub.ssl_key = '{{ jupyterhub_ssl_key }}'
c.JupyterHub.ssl_cert = '{{ jupyterhub_ssl_cert }}'
{% endif %}

# Specify overrides from variables
{% if jupyterhub_config_overrides is defined %}
{% for key, value in jupyterhub_config_overrides.items() %}
c.{{ key }} = {{ value }}
{% endfor %}
{% endif %}

Once it was all deployed, I was able to go to hub.pdavies.io, log in, and get this:

I do a quick test by opening a new notebook and running a hello world command:

Connect VS Code to JupyterHub

This is the good part: having this server and using its compute power without having to do everything on it. So on my local Mac, I open VS Code and install the JupyterHub extension by Microsoft.

And just by following the quick start guide I was able to connect to my AI server running JupyterHub and run commands:

First, create a new notebook

Then, select the kernel

Then run the basic command

Done. The great thing is that this code is stored on my local machine, so I can interact with it using all my local tools and have it run on the server built for AI/ML with little effort from here on in.
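If VS Code refuses to connect, one quick sanity check is to ask the hub for its version over the REST API, which JupyterHub serves under /hub/api/ and which should answer without a login. This is a hedged sketch rather than anything from the quick start guide; hub_api_url and hub_version are names I made up, and the base URL is whatever address your hub is published on.

```python
import json
import urllib.request


def hub_api_url(base_url):
    """Build the REST API root URL from the hub's public URL."""
    return base_url.rstrip("/") + "/hub/api/"


def hub_version(base_url, timeout=10):
    """Fetch the JupyterHub version string from the API root."""
    with urllib.request.urlopen(hub_api_url(base_url), timeout=timeout) as response:
        return json.load(response).get("version")


# Example usage (makes a network request, so it is left commented out):
# print(hub_version("https://hub.pdavies.io"))
```

If this call fails from the same machine VS Code runs on, the problem is more likely in the proxy or DNS than in the extension.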

So what's next? Well, time to start coding and build something. Apart from that, I probably need to figure out how to install different packages and have different virtual environments for different projects. Lastly, I want to figure out how to ensure I am actually using the GPU: do I need other config, are there tools for monitoring that, etc.?
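For the GPU question, one possible starting point is a notebook cell that asks PyTorch whether it can see a CUDA device. This is only a sketch and it assumes PyTorch has been installed into the environment the notebook kernel uses, which this post does not cover; cuda_status is a hypothetical helper name.

```python
import importlib.util


def cuda_status():
    """Return None if PyTorch is not installed, else a dict describing CUDA visibility."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    available = torch.cuda.is_available()
    return {
        "available": available,
        # Only ask for the device name when a CUDA device is actually visible
        "device": torch.cuda.get_device_name(0) if available else None,
    }


print(cuda_status())
```

A result of "available": False does not necessarily mean the driver is broken. The systemdspawner documentation describes isolate_devices as giving each user a private /dev, which may hide the GPU device nodes from the notebook server, so that setting is worth checking first.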

Make sure you read the next blog, as from this point there is a redesign of how JupyterHub is set up and getting the GPU to work!