Jupyter Ascending me into stress and anxiety
Not that I’m upset about it or anything, but I’ve spent the last couple of days (make that a week, it’s been a week) trying to get a multi-user instance of JupyterHub to honour system-wide environment settings. It’s not a thing, don’t go looking for it - you’ll end up like me: bitter and ginger.
Sometimes it’s good to have your expectations challenged.
Judging by the lack of documentation on the topic, I guess we have a relatively rare configuration running. Either that or I’m blatantly not grasping this. I may write about it more in the future, but in brief:
- CentOS 7
- Jupyter{Hub,Lab}
- configured with SudoSpawner to spawn non-root instances
- all connections outside the office go through a proxy
Getting to it…
Long story short, you need to configure the proxy specifically for JupyterHub.
Per-user proxy
In the case of a single user, create ~/.ipython/profile_default/startup/00-startup.py and paste in the below, modifying it to meet your requirements.
import os

# Placeholder address: replace proxy.domain and port with your proxy's host and port.
os.environ['http_proxy'] = "http://proxy.domain:port"
os.environ['https_proxy'] = "http://proxy.domain:port"
Once complete, you can run %env within a notebook to confirm the variables are set.
{'PATH': '$PATH:/usr/bin:/usr/local/lib/npm/bin/:/usr/local/bin:/usr/local/texlive/2020/bin/x86_64-linux',
...
'http_proxy': 'http://proxy.domain:port',
'https_proxy': 'http://proxy.domain:port'}
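If you would rather not scan the full %env output, the same check can be done in a notebook cell. This is a minimal sketch, not part of the original setup: the helper names proxy_report and missing_proxy_vars are made up for illustration, and it only reads the lowercase variable names used in this post.

```python
import os

# The three variables this post cares about (lowercase, as set above).
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def proxy_report(environ=None):
    """Return each proxy variable's value, or None when it is unset."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in PROXY_VARS}


def missing_proxy_vars(environ=None):
    """List the proxy variables that are unset or empty."""
    return [name for name, value in proxy_report(environ).items() if not value]


if __name__ == "__main__":
    # In a notebook cell this prints the current kernel's settings.
    for name, value in proxy_report().items():
        print(f"{name} = {value!r}")
```

An empty list from missing_proxy_vars() means all three variables reached the kernel.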
Honestly, I’ve had a hell of a time with this and received so many mixed results. Initially this worked for me when setting the variables in ~/.jupyter/.profile_default/.../..., but on trying again, it didn’t take.
I highly recommend clearing the cache any time you make changes to your environment - doing this would have saved me a lot of time.
Setting variables across all users
I came across a blog post whose author had found a way to pass variables into Jupyter Notebook. This helped me immensely. With that in hand, I went back and found the spawner API documentation for reference.
I skipped editing the service for JupyterHub, opting to configure everything directly in /etc/jupyterhub/jupyterhub-config.py.
#--------------------------
# Proxy configuration
#--------------------------
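import os
# Same proxy for both HTTP and HTTPS traffic
os.environ['http_proxy'] = os.environ['https_proxy'] = 'http://proxy.domain:port'
# Hosts that must bypass the proxy, including the Hub API on this server
os.environ['no_proxy'] = '.internal.domain.org,localhost,127.0.0.1'
# Pass these variables from the Hub's environment through to spawned servers
c.Spawner.env_keep.extend(['http_proxy', 'https_proxy', 'no_proxy'])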
In my case, the http_proxy and https_proxy addresses are the same, so to save adding an extra line, I configured as above.
no_proxy is required here. Without it, all requests are passed through the proxy, and SudoSpawner loses the ability to communicate with the JupyterHub API running locally on the same server. I added the internal domain as well, so that internal traffic isn’t sent over the proxy unnecessarily.
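To make the no_proxy behaviour concrete, here is a rough sketch of how a client might decide whether a host skips the proxy. It is a simplified model I am assuming for illustration, not the exact logic of JupyterHub, SudoSpawner or any particular HTTP library: real clients differ in how they treat leading dots, ports and IP ranges. The function name bypasses_proxy is hypothetical, and the port in the usage note is only an example.

```python
def bypasses_proxy(host, no_proxy):
    """Simplified model: True when host matches an entry in no_proxy.

    An entry matches the host itself or any subdomain of it.
    IPv6 literals and CIDR ranges are deliberately not handled.
    """
    host = host.lower().split(":")[0]  # drop an optional :port suffix
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")  # ignore leading dots
        if not entry:
            continue
        if host == entry or host.endswith("." + entry):
            return True
    return False


# The value used in the configuration above.
NO_PROXY = ".internal.domain.org,localhost,127.0.0.1"
```

Under this model, bypasses_proxy("127.0.0.1:8081", NO_PROXY) is true, so a request to a Hub API listening locally stays off the proxy, while bypasses_proxy("pypi.org", NO_PROXY) is false and still goes out through it.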
The more you care, the more JupyterHub finds ways to hurt you for it. ~ Jupiter Jones