Anaconda+Jupyter Walkthrough

November 14, 2019

Join us on December 3rd in RSF X320A&B-Beaver Creek for a practical walkthrough of the popular data science platforms Anaconda and Jupyter (with a particular focus on notebooks) on HPC systems. This is a collaboration between the HPC Operations and Applications teams to address common use cases, pitfalls, and tips for using these applications effectively, both on HPC systems and in general.

Note that we will primarily focus on using Anaconda/Jupyter to run Python, and much less on Julia or R.

General concepts to be covered:

  • Brief intro to what Anaconda and Jupyter are, when/how you might typically use them, and dispelling common misconceptions
  • Common problems with having many Anaconda and Python environments, and techniques to mitigate them
    • Where/How/Why to change where conda environments are stored
    • Managing Conda envs+cache with storage restrictions
    • "I installed modules and activated my conda environment, but I get module not found in my notebook"
    • "How do I share a conda environment with my colleagues?"
  • Various techniques to access a notebook server:
    • Run your own as a job (don't run this on a login node!)
    • Use our JupyterHub instance
    • The pros and cons of both of these approaches
    • How to run different kernels with either approach, to use different environments or entirely different interpreter backends
  • What to keep in mind while using Conda/virtual environments in a SLURM job
    • If you run a multi-node SLURM job using a virtual environment in your /home instead of /scratch, this will tank the network throughput of your job.
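
As a taste of the "module not found in my notebook" item above, here is a minimal sketch of one way to check which interpreter a notebook kernel is actually using. It is an illustration only, not material from the session; the helper name `describe_interpreter` is hypothetical, and it relies solely on the Python standard library.

```python
import importlib.util
import sys


def describe_interpreter(module_name):
    """Report which Python is running and whether a top-level module is importable.

    Hypothetical helper for illustration; pass a top-level module name
    such as "numpy", not a dotted submodule path.
    """
    spec = importlib.util.find_spec(module_name)  # None if the module is not found
    return {
        "executable": sys.executable,  # path of the interpreter running this code
        "prefix": sys.prefix,          # root directory of the active environment
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    # "json" ships with Python, so it should always be found.
    for key, value in describe_interpreter("json").items():
        print(f"{key}: {value}")
```

Running this in a notebook cell and again in a terminal with the conda environment activated lets you compare the two `executable` and `prefix` values. If they differ, the kernel is probably not running from the environment where the package was installed.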

If you have any suggestions or requests for content to be discussed, please reach out to Michael Bartlett.

Seats are limited. Please RSVP for the presentation.