  • Introduction
  • HPCToolkit Overview
    • Asynchronous Sampling and Call Path Profiling
    • Recovering Static Program Structure
    • Aggregating and Attributing Performance Measurements
    • Presenting Performance Measurements
  • Quick Start
    • Guided Tour
      • Compiling an Application
      • Measuring Application Performance
        • Specifying CPU Sample Sources
        • Measuring GPU Computations
      • Recovering Program Structure
        • Caching Structure Results
      • Attributing Measurements to Source Code
      • Presenting Performance Measurements for Interactive Analysis
      • Effective Performance Analysis Techniques
    • Additional Guidance
  • Installing HPCToolkit with Spack
    • Config Files
      • Config.yaml
      • Modules.yaml
    • Installing a Basic HPCToolkit
    • Installing Hpcviewer
    • Configuration Options
      • CUDA (+cuda)
      • Level Zero (+level_zero)
      • ROCm (+rocm)
      • OpenCL (+opencl)
      • MPI (+mpi)
      • PAPI vs Perfmon (+papi)
      • Python (+python)
    • Running Spack for the First Time
  • Building from Source with Meson
    • Quickstart
    • Configuration
    • Installing Dependencies without Root
      • Meson Wraps
      • Dev Containers (BETA)
    • Installing Dependencies as Root
      • Debian/Ubuntu and Derivatives
      • Fedora/RHEL and Derivatives
      • SUSE Leap/SLES 15 and Derivatives
    • Custom Dependencies
    • Meson Documentation References
  • Effective Strategies for Analyzing Program Performance
    • Monitoring High-Latency Penalty Events
    • Computing Derived Metrics
    • Pinpointing and Quantifying Inefficiencies
    • Pinpointing and Quantifying Scalability Bottlenecks
      • Scalability Analysis Using Expectations
        • Weak Scaling
        • Exploring Scaling Losses
  • Monitoring Dynamically-linked Applications with hpcrun
    • Using hpcrun
      • If hpcrun causes your application to fail
        • hpcrun causes failures related to loading or using shared libraries
        • hpcrun causes your application to fail when gprof instrumentation is present
    • Hardware Counter Event Names
    • Sample Sources
      • Linux perf_events
        • Capabilities of HPCToolkit’s perf_events Interface
          • Frequency-based sampling.
          • Multiplexing.
          • Thread blocking.
        • Launching
        • Notes
      • PAPI
        • Proxy Sampling
      • REALTIME and CPUTIME
      • IO
      • MEMLEAK
    • Experimental Python Support
      • Known Limitations
    • Process Fraction
    • API to Start and Stop Sampling
    • Environment Variables for hpcrun
    • Cray System Specific Notes
  • Monitoring MPI Applications
    • Running and Analyzing MPI Programs
    • Building and Installing HPCToolkit
  • Measurement and Analysis of GPU-accelerated Applications
    • GPU Performance Measurement Substrate
      • Profiling GPU Activities
      • Tracing GPU Activities
    • NVIDIA GPUs
      • Performance Measurement of CUDA Programs
      • PC Sampling on NVIDIA GPUs
      • Attributing Measurements to Source Code for NVIDIA GPUs
      • GPU Calling Context Tree Reconstruction
    • AMD GPUs
      • PC Sampling on AMD GPUs
      • Hardware Counters on AMD GPUs
    • Intel GPUs
    • Performance Measurement of OpenCL Programs
  • Measurement and Analysis of OpenMP Multithreading
    • Monitoring OpenMP on the Host
    • Monitoring OpenMP Offloading on GPUs
      • NVIDIA GPUs
      • AMD GPUs
      • Intel GPUs
  • Hpcviewer
  • Overview
    • Downloading
    • Building from Source
    • Launching
    • Menus
      • File
      • Filter
      • View
      • Help
    • Limitations
  • Profile View
    • Panes
      • Source Pane
      • Navigation Pane
        • Control Panel
        • Context menus
      • Metric Pane
    • Understanding Metrics
      • How Metrics are Computed
      • Example
    • Derived Metrics
      • Formulae
      • Examples
      • Creating Derived Metrics
    • Metrics in Execution-context level
      • Plot Graphs
      • Thread View
    • Filtering Tree Nodes
    • Convenience Features
      • Source Code Pane
      • Metric Pane
  • Trace view
    • Action and Information Pane
    • Customizing the Color Map
    • Filtering Execution Contexts
      • Filtering Suggestions
  • Accessing Remote Databases
    • Building and Installing hpcserver
    • Opening a Remote Database
  • Known Issues
    • No support for CUDA 13
    • Using Level Zero, time may be observed as non-monotonic
    • When monitoring applications that use ROCm, using LD_AUDIT in hpcrun may cause it to fail to elide OpenMP runtime frames
    • When using Intel GPUs, hpcrun may report that substantial time is spent in a partial call path consisting of only an unknown procedure
    • hpcrun reports partial call paths for code executed by a constructor prior to entering main
    • hpcrun may fail to measure a program execution on a CPU with hardware performance counters
    • hpcrun may associate several profiles and traces with rank 0, thread 0
    • hpcrun sometimes enables writing of read-only data
    • A confusing label for GPU theoretical occupancy
  • FAQ and Troubleshooting
    • General Measurement Failures
      • Profiling setuid programs
      • Problems loading dynamic libraries
      • Problems caused by gprof instrumentation
    • Measurement Failures using NVIDIA GPUs
      • Deadlock while monitoring a program that uses IBM Spectrum MPI and NVIDIA GPUs
      • Ensuring permission to use GPU performance counters
      • Avoiding the error cudaErrorUnknown
      • Avoiding the error CUPTI_ERROR_NOT_INITIALIZED
      • Avoiding the error CUPTI_ERROR_HARDWARE_BUSY
      • Avoiding the error CUPTI_ERROR_UNKNOWN
    • General Measurement Issues
      • How do I choose sampling periods?
      • Why do I see partial unwinds?
      • Measurement with HPCToolkit has high overhead! Why?
      • Some of my syscalls return EINTR
      • My application spends a lot of time in C library functions with names that include mcount
    • Problems Recovering Loops in NVIDIA GPU binaries
    • Graphical User Interface Issues
      • hpcviewer fails to launch
      • Fail to run hpcviewer: executable launcher was unable to locate its companion shared library
      • Launching hpcviewer is very slow on Windows
      • Mac only: hpcviewer runs on Java X instead of “Java 17”
      • When executing hpcviewer, it complains cannot create “Java Virtual Machine”
      • hpcviewer fails to launch due to java.lang.NoSuchMethodError exception.
      • hpcviewer fails due to java.lang.OutOfMemoryError exception.
      • hpcviewer writes a long list of Java error messages to the terminal!
      • hpcviewer attributes performance information only to functions and not to source code loops and lines! Why?
      • hpcviewer hangs trying to open a large database! Why?
      • hpcviewer runs glacially slowly! Why?
      • hpcviewer does not show my source code! Why?
        • An explanation how HPCToolkit finds source files
      • hpcviewer’s reported line numbers do not exactly correspond to what I see in my source code! Why?
      • hpcviewer claims that there are several calls to a function within a particular source code scope, but my source code only has one! Why?
      • hpcviewer’s Trace view shows lots of white space on the left. Why?
    • Debugging
      • How do I debug HPCToolkit’s measurement?
      • Tracing HPCToolkit’s Measurement Subsystem
      • Using a debugger to inspect an execution being monitored by HPCToolkit
  • Environment Variables
    • Environment Variables for Users
    • Environment Variables that May Avoid a Crash
    • Environment Variables for Developers
  • Getting Help

HPCToolkit User Manual

By The HPCToolkit Developers

© Copyright HPCToolkit Project a Series of LF Projects, LLC.