Intel Parallel Studio on the Peregrine System

Intel Parallel Studio is a set of tools for developing and optimizing software for the latest processor architectures.

Some of the tools available as part of Intel Parallel Studio on Peregrine include:

Intel VTune Amplifier XE is a performance profiler for C, C++, C#, Fortran, Assembly and Java code. Hotspots analysis provides a list of functions sorted by the CPU time they consume. Other features help the user quickly find common causes of slow performance in parallel programs, including waiting too long at locks and load imbalance among threads and processes. VTune Amplifier XE uses the Performance Monitoring Unit (PMU) on Intel processors to collect data with very low overhead.

The recommended way to use this tool on Peregrine is to run the profiler from the command line, and then either view the data using the GUI or generate a text report from the command line.

You can list all the profiling options available on the machine you're profiling on, either from the GUI or from the command line using amplxe-cl -collect-list.

An example batch script to collect an HPC Performance Characterization profile is given below:

#!/bin/bash --login
#PBS -N <job name>
#PBS -q <queue>
#PBS -l nodes=<N>:ppn=<n>
#PBS -l walltime=00:30:00
#PBS -A <Allocation handle>
# set your tmpdir, and don't forget to clean it after your job
# completes.
export TMPDIR=/scratch/$USER/tmp
# load the module to use VTune
module load comp-intel/2017.0.5
. /nopt/intel/17.0.5/parallel_studio_xe_2017.5.061/psxevars.sh

# profile the executable
amplxe-cl --collect hpc-performance ./executable.exe

GUI

amplxe-gui
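
The text report mentioned above can be produced once a collection has finished. The following is a minimal sketch, not a Peregrine-specific recipe: the result directory name r000hpc is an assumption (use the directory that your collection run actually created), and run_or_show is a hypothetical helper that only prints the command when amplxe-cl is not on your PATH.

```shell
#!/bin/bash
# Sketch: print a text summary of a finished VTune collection.
# RESULT_DIR is an assumed name; substitute the directory your run created.
RESULT_DIR="${RESULT_DIR:-r000hpc}"

# Hypothetical helper: run a command if its tool is on PATH,
# otherwise just show what would have been run.
run_or_show() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Generate the summary report as text on standard output.
run_or_show amplxe-cl -report summary -r "$RESULT_DIR"
```

Redirecting the output to a file keeps the report alongside the job's other output.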

Intel Trace Analyzer and Collector is a tool for understanding the behavior of MPI applications. Use this tool to visualize and understand MPI parallel application behavior, evaluate load balancing, learn more about communication patterns and identify communication hotspots.

The recommended way to use this tool on Peregrine is to collect data from the command line and view the data using the GUI.

Example batch script to collect MPI communication data:

#!/bin/bash --login
#PBS -N <job name>
#PBS -q <queue>
#PBS -l nodes=<N>:ppn=<n>
#PBS -l walltime=00:30:00
#PBS -A <Allocation handle>
# set your tmpdir, and don't forget to clean it after your job
# completes.
export TMPDIR=/scratch/$USER/tmp
# load the module to use Traceanalyzer
module load impi-intel/2017.0.5
module load comp-intel/2017.0.5
. /nopt/intel/17.0.5/parallel_studio_xe_2017.5.061/psxevars.sh

# to profile the executable, just append '-trace' to mpirun
mpirun -trace -n 4 ./executable.exe
# this generates a .stf file that can be viewed using the GUI

GUI

traceanalyzer

Intel Advisor helps with vectorization and threading in your C, C++ and Fortran applications. This tool identifies the areas that would benefit the most from vectorization, shows what is blocking vectorization, and gives insights on how to overcome it.

# module to load
module load comp-intel/2017.0.5
. /nopt/intel/17.0.5/parallel_studio_xe_2017.5.061/psxevars.sh

# set your tmpdir, and don't forget to clean it after your job
# completes.
export TMPDIR=/scratch/$USER/tmp

You can list all the available profiling options for the machine you're profiling on, from the GUI or from the command line using:

advixe-cl -collect-list

This tool has a lot of features that can be accessed from the GUI:

GUI

advixe-gui
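
For a command-line collection, a survey analysis is a common starting point. The sketch below assumes the module and psxevars.sh have already been loaded as shown above; the executable name and the project directory ./advisor_results are placeholders, not Peregrine defaults.

```shell
#!/bin/bash
# Sketch: collect and print an Advisor survey from the command line.
# APP and PROJECT_DIR are assumed placeholder names.
APP="./executable.exe"
PROJECT_DIR="./advisor_results"

if command -v advixe-cl >/dev/null 2>&1; then
    # Collect survey data for the application.
    advixe-cl -collect survey -project-dir "$PROJECT_DIR" -- "$APP"
    # Print the survey report as text.
    advixe-cl -report survey -project-dir "$PROJECT_DIR"
    status="collected"
else
    echo "advixe-cl not found; load the module and source psxevars.sh first" >&2
    status="skipped"
fi
```

The same project directory can then be opened in advixe-gui.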

Intel Inspector XE is an easy-to-use memory checker and thread checker for serial and parallel applications written in C, C++, C#, F# and Fortran. It takes you to the source locations of threading and memory errors and provides a call stack to help you determine how you got there. This tool has a GUI and a command line interface.

# module to load  
module load comp-intel/2017.0.5
. /nopt/intel/17.0.5/parallel_studio_xe_2017.5.061/psxevars.sh

# set your tmpdir, and don't forget to clean it after your job
# completes.
export TMPDIR=/scratch/$USER/tmp

You can list all the available profiling options for the machine you're running this tool on, from the GUI or from the command line using:

inspxe-cl -collect-list

This tool has a lot of features that can be accessed from the GUI:

GUI

inspxe-gui
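
As an illustration of the command line interface, the sketch below runs a memory leak check and prints the problems found. It assumes the environment has been set up as shown above; the executable name and the result directory ./inspector_results are placeholders. The analysis type mi1 is the memory leak detection preset, and other presets can be chosen from the collect list.

```shell
#!/bin/bash
# Sketch: run an Inspector memory leak check from the command line.
# APP and RESULT_DIR are assumed placeholder names.
APP="./executable.exe"
RESULT_DIR="./inspector_results"

if command -v inspxe-cl >/dev/null 2>&1; then
    # mi1: detect memory leaks (the lightest memory analysis preset).
    inspxe-cl -collect mi1 -result-dir "$RESULT_DIR" -- "$APP"
    # Print the detected problems as text.
    inspxe-cl -report problems -result-dir "$RESULT_DIR"
    status="collected"
else
    echo "inspxe-cl not found; load the module and source psxevars.sh first" >&2
    status="skipped"
fi
```

The result directory can also be opened in inspxe-gui to navigate to the source locations.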

The new Application Performance Snapshot merges the earlier MPI Performance Snapshot and Application Performance Snapshot Tech Preview. MPI Performance Snapshot is no longer available separately, but all of its capabilities and more are available in the new combined snapshot. This tool lets you take a quick look at your application's performance to see if it is well optimized for modern hardware. It also includes recommendations for further analysis if you need more in-depth information.

Using this tool on Peregrine:

$ . /nopt/nrel/apps/IntelAPS/APS_2018_lin_523034/apsvars.sh
# serial/SMP executable
$ aps <executable> # this generates an aps result directory
# DMP executable
$ mpirun -n 4 aps <executable>
# this generates an aps result directory
# to generate text and HTML result files:
$ aps --report=<the generated results directory from the previous step>
# the result file can be viewed in a browser or text editor

Before you begin, please make sure that your application is compiled with the debug flag (-g) to enable profiling and debugging.

When using the suite of tools from Intel Parallel Studio on Peregrine, we recommend that you set your TMPDIR to point to a location in your SCRATCH directory.

export TMPDIR=/scratch/$USER/tmp

Important: Please make sure that you clean up this directory after your job completes.
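
One way to make the cleanup automatic is to create a per-job subdirectory and remove it when the script exits. This is only a sketch: the function names setup_tmpdir and cleanup_tmpdir and the variable JOB_TMPDIR are hypothetical, and the per-job subdirectory layout is an assumption rather than a Peregrine requirement.

```shell
#!/bin/bash
# Sketch: per-job temporary directory that is removed when the job ends.
# setup_tmpdir, cleanup_tmpdir and JOB_TMPDIR are hypothetical names.

# Create a unique subdirectory under the given base and point TMPDIR at it.
setup_tmpdir() {
    mkdir -p "$1" || return 1
    JOB_TMPDIR="$(mktemp -d "$1/job.XXXXXX")" || return 1
    export TMPDIR="$JOB_TMPDIR"
}

# Remove only the directory that setup_tmpdir created.
cleanup_tmpdir() {
    if [ -n "$JOB_TMPDIR" ] && [ -d "$JOB_TMPDIR" ]; then
        rm -rf "$JOB_TMPDIR"
    fi
}

# Usage inside a batch script (left commented so the functions can be
# sourced without touching /scratch):
#   setup_tmpdir "/scratch/$USER/tmp" && trap cleanup_tmpdir EXIT
```

Tracking the created path in a separate variable avoids deleting a directory that the script did not create if TMPDIR is changed later.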