Using Anaconda Python on the Eagle System

Anaconda Python is our actively supported distribution on the Eagle system.

To use Anaconda, run module load conda. The base package is built on Python 2.7, but users can create custom environments with other versions, including Python 3. In all cases, the numerically intensive routines in numpy and scipy are MKL-enabled for speed.
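
To confirm the MKL linkage in a given environment, you can print numpy's build configuration (numpy.show_config() is a standard numpy utility; MKL-backed builds list the mkl libraries):

[user@HOSTNAME ~]$ python -c "import numpy; numpy.show_config()"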

Custom Environments

Users can install libraries into custom environments. Create environments in a directory where you have write privileges using the -p PATH option. Do not use the --clone=root option when creating custom environments; it was not designed for use in a shared computing environment. The following example creates a new Python 2 environment with numpy, scipy, and pandas:

[user@HOSTNAME ~]$ module purge; module load conda
[user@HOSTNAME ~]$ conda create -n exampleenv python=2 numpy scipy pandas
Fetching package metadata .......
Solving package specifications: ..........
<SNIP>
Extracting packages ... [ COMPLETE ]|###################################| 100%
Linking packages ... [ COMPLETE ]|###################################| 100%
#
# To activate this environment, use:
# $ source activate exampleenv
#
# To deactivate this environment, use:
# $ source deactivate
#
[user@HOSTNAME ~]$ source activate exampleenv
(exampleenv) [user@HOSTNAME ~]$ type python
python is /home/user/.conda-envs/exampleenv/bin/python
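
To place an environment on /scratch instead of your home directory, use the -p option described above; for example (the path here is illustrative):

[user@HOSTNAME ~]$ conda create -p /scratch/$USER/conda/exampleenv python=2 numpy scipy pandas
[user@HOSTNAME ~]$ source activate /scratch/$USER/conda/exampleenv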

Some Common Commands and Options

  • conda create --name <envname> python=3 numpy : Create a custom Python 3 environment named <envname> and install numpy and its dependencies
  • . activate <envname> : Activate environment <envname>. "." is just Bash shorthand for "source".
  • conda search <package> : Look for a package in the conda repo that isn't installed yet.
  • conda install --name <envname> <package> : Install package <package> into custom environment <envname>. The --name <envname> option isn't required if that environment is already your active one.
  • conda list [<package>] : List installed packages, or a particular package that's installed.
  • conda [install | search] -c <channel> <package> : Specify an alternative channel. For example,
    • conda search -c conda-forge qutip (package not in the base repo)
    • conda search -c r r-bivgeo (specialized repository for lesser-known R packages)
  • conda info --envs : See a list of your accessible environments
  • conda remove --name <envname> --all : Delete custom environment <envname>
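
A short session combining several of these commands might look like the following (the environment name py3env and the packages chosen are illustrative):

[user@HOSTNAME ~]$ conda create --name py3env python=3 numpy
[user@HOSTNAME ~]$ . activate py3env
(py3env) [user@HOSTNAME ~]$ conda install pandas
(py3env) [user@HOSTNAME ~]$ conda list numpy
(py3env) [user@HOSTNAME ~]$ source deactivate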

Using environment.yml Files

Anaconda supports creating environments from a YAML-formatted file, conventionally named environment.yml. This allows the same environment to be reproduced wherever the dependent code is run: for example, an environment.yml file created on a developer's laptop can also be used on the HPC system to build the environment the code needs. The conda module has its own environment.yml that is used to create the root environment. This file can be downloaded and modified to produce the environment needed for custom code.
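
For example, an existing environment can be exported to a file on one machine and recreated from it on another (conda env export and conda env create are standard conda subcommands; environment.yml is the conventional filename):

(exampleenv) [user@HOSTNAME ~]$ conda env export > environment.yml
[user@HOSTNAME ~]$ conda env create -f environment.yml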

If you require something not available in the default environment and are unable to create a custom environment, contact us to determine if the default environment can be adapted to your needs.

Example Environment Using High-Performance openmpi

In this example, a new conda environment is created containing numpy, mpi4py, and a high-performance openmpi built with InfiniBand support. The environment is created twice to demonstrate both methods: first with a single conda create command, then from an environment.yml file with conda env create. $ represents the user's shell prompt, <snip> marks removed unimportant output, and everything else is output from the commands:

$ conda create -p /scratch/$USER/conda/myhpompi -c local python=2 mpi4py numpy
Solving environment: done

## Package Plan ##

  environment location: /scratch/hsorense/conda/myhpompi

  added / updated specs:
    - mpi4py
    - numpy
    - python=2


The following packages will be downloaded:
<snip>

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate /scratch/$USER/conda/myhpompi
#
# To deactivate an active environment, use
#
# $ conda deactivate

environment.yml:

channels:
- local
- defaults
dependencies:
- numpy
- openmpi
- mpi4py
- git

Creating from environment.yml:

$ conda env create -p /scratch/$USER/conda/myhpompi
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate /scratch/$USER/conda/myhpompi
#
# To deactivate an active environment, use
#
# $ conda deactivate

Testing high-performance openmpi:

$ . activate /scratch/$USER/conda/myhpompi
(/scratch/$USER/conda/myhpompi) $ git clone git@github.nrel.gov:hsorense/mpi-hpc.git
Cloning into 'mpi-hpc'...
remote: Counting objects: 47, done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 47 (delta 12), reused 24 (delta 5), pack-reused 0
Receiving objects: 100% (47/47), 6.70 KiB | 3.35 MiB/s, done.
Resolving deltas: 100% (12/12), done.
(/scratch/$USER/conda/myhpompi) $ cd mpi-hpc/benchmarks
(/scratch/$USER/conda/myhpompi) $ salloc -N 2 -t 30 mpirun -np 2 -npernode 1 --report-bindings python sendrecv.py
salloc: Granted job allocation 527422
[r1i0n14:250615] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././././././.][./././././././././././././././././.]
[r1i0n15:250507] MCW rank 1 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././././././.][./././././././././././././././././.]
#---------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000         2.14         3.28         2.53         0.00
            1         1000         2.15         2.16         2.15         0.93
            2         1000         2.13         2.15         2.14         1.87
            4         1000         2.13         2.14         2.14         3.74
            8         1000         2.23         2.25         2.24         7.15
           16         1000         2.25         2.25         2.25        14.23
           32         1000         2.24         2.26         2.25        28.49
           64         1000         2.38         2.39         2.39        53.64
          128         1000         2.90         2.91         2.91        87.94
          256         1000         2.99         3.02         3.00       170.48
          512         1000         3.12         3.14         3.13       327.29
         1024         1000         3.39         3.41         3.40       601.99
         2048         1000         3.83         3.84         3.83      1068.68
         4096         1000         4.87         4.89         4.88      1679.05
         8192         1000         6.35         6.36         6.35      2578.23
        16384         1000         9.42         9.48         9.44      3470.23
        32768         1000        11.67        11.95        11.80      5552.93
        65536          640        15.16        15.30        15.24      8598.36
       131072          320        23.76        23.94        23.83     10999.60
       262144          160        38.01        38.96        38.33     13677.31
       524288           80        64.46        65.67        65.01     16130.67
      1048576           40       111.10       120.06       115.10     18220.05
      2097152           20       210.58       243.21       227.87     18406.46
      4194304           10       473.91       505.02       493.01     17015.03
salloc: Relinquishing job allocation 527422
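
The sendrecv.py benchmark script is not reproduced here. As a minimal sketch of the same pattern (illustrative only, not the benchmark code from the mpi-hpc repository), a two-rank mpi4py exchange could look like:

# sendrecv_minimal.py -- minimal two-rank mpi4py exchange (illustrative sketch;
# not the benchmark script from the mpi-hpc repository)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
assert comm.Get_size() == 2, "run with exactly 2 MPI ranks"

partner = 1 - rank                     # rank 0 pairs with rank 1
sendbuf = np.full(4, rank, dtype='i')  # small numpy buffer to exchange
recvbuf = np.empty(4, dtype='i')

# Send to and receive from the partner rank in one call.
comm.Sendrecv(sendbuf, dest=partner, recvbuf=recvbuf, source=partner)
print("rank %d received %s" % (rank, recvbuf))

Such a script would be launched the same way as the benchmark, e.g. mpirun -np 2 python sendrecv_minimal.py inside the activated environment.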

Python Interactive Shell / Jupyter

Note: Since the CentOS 7 transition, it has become necessary to unset a Jupyter-related variable by hand in order for the workflow below to succeed. We are working to resolve this, but until we do, please add the following instruction to any job scripts that use Jupyter, or run it in interactive workflows before launching Jupyter:

[user@HOSTNAME ~]$ unset XDG_RUNTIME_DIR

Jupyter is supported in the default conda environment. However, the HPC firewall prohibits direct connections to the notebook server for security reasons, so an SSH tunnel is required to access it.

  1. Request a compute node as an interactive shell and start the notebook server:
[user@HOSTNAME ~]$ salloc -N 1 -t 60
[user@NODENAME ~]$ module purge; module load conda
[user@NODENAME ~]$ jupyter notebook --no-browser --ip=* (may require --ip=0.0.0.0)
[I 10:00:14.868 NotebookApp] Serving notebooks from local directory: /home/user
[I 10:00:14.868 NotebookApp] 0 active kernels
[I 10:00:14.869 NotebookApp] The Jupyter Notebook is running at: http://127.0.0.1:8888/?token=somerandomstring
[I 10:00:14.869 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 10:00:14.870 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://127.0.0.1:8888/?token=somerandomstring
  2. On your local machine, start an SSH tunnel:
laptop:~ user$ ssh -L 8888:NODENAME:8888 HOSTNAME
[user@HOSTNAME ~]$

* Replace HOSTNAME with the login node and NODENAME with the compute node,
   e.g., HOSTNAME=ed1.hpc.nrel.gov and NODENAME=r4i2n27

  3. On your local machine, start a web browser and go to http://127.0.0.1:8888/?token=somerandomstring, using the token shown in the notebook startup output.

See the quick reference sheet for IPython.

Python Debugging

Knowing how to use the Python pdb debugging module can speed development.
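
As a quick illustration (the script and function here are made up for this example), pdb.set_trace() drops execution into the interactive debugger at a chosen point:

# mean_example.py -- toy script to illustrate pdb (hypothetical example)
import pdb

def mean(values):
    total = 0
    for v in values:
        total += v
    pdb.set_trace()  # execution pauses here; inspect total, values, etc.
    return total / len(values)

print(mean([2, 4, 6]))

Alternatively, run an entire script under the debugger with python -m pdb mean_example.py. Useful commands at the (Pdb) prompt include n (next), s (step), p <expression> (print), and c (continue).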

See the quick reference sheet for pdb.