STAT _________________________________________________________ Table of Contents Overview Dependent Packages Installation _________________________________________________________ Overview STAT, the Stack Trace Analysis Tool, helps isolate bugs by gathering stack traces from each individual process of a parallel application and merging them into a global, yet compact representation. Each stack trace, as depicted in Figure 1, captures the function calling sequence of an individual process. The nodes are labeled with the function names and the directed edges show the function calling sequence from caller to callee. STAT's stack trace merging process forms a call graph prefix tree, which can be seen in Figure 1. The prefix tree groups together traces from different processes that have the same calling sequence and labels the edges with the count and set of tasks that exhibited that calling sequence. Nodes in the prefix tree that are visited by the same set of tasks are given the same color, providing the user with a quick means of identifying the various process equivalence classes. [trace_and_merge.png] Figure 1. A single stack trace (left) and a STAT merged call prefix tree (right) STAT merges stack traces into 2D spatial and 3D spatial-temporal call prefix trees. The 2D spatial call prefix tree (Figure 2) represents a single snapshot of the entire application. The 3D spatial-temporal call prefix tree (Figure 3) takes a series of snapshots from the application over time and is useful for analyzing time-varying behavior. [bgl4k_2d.png] Figure 2. A 2D spatial call prefix tree [bgl4k_3d.png] Figure 3. A 3D spatial call prefix tree Stack traces based on function names only provide a high-level overview of the application's execution. However, for certain bugs this view may be too coarse-grained so STAT is also capable of gathering stack traces with more fine-grained information. In particular, STAT can also record the program counter of each frame or with the appropriate debug information compiled into the application (i.e., with the "-g" compiler flag), STAT can gather the source file and line number of each stack frame. Both of these refinements can further delineate processes and refine the process equivalence classes. In addition, line number information can be fed into a static code analysis engine to derive the logical temporal order of the MPI tasks Figure 4. This analysis traverses from the root of the tree towards the leaves, at each step analyzing the control flow of the source code and sorting sibling nodes by the amount of execution progress made through the code. For straight-line code, this simply means that one task has made more progress if it has executed past the point of another task, i.e., if it has a greater line number. This ordering is partial since two tasks in different branches of an if-else are incomparable. In cases where the program points being compared are within a loop, STAT can extract the loop ordering variable from the application processes and further delineate tasks by execution progress. This analysis is useful for identifying the culprit in a deadlocked or livelocked application, where the problematic task has often either made the least or most progress through the code, leaving the remaining tasks stuck in a barrier or blocked pending a message. Note, this feature is still a prototype. Please contact Greg Lee for an experimental version. [stat_to_example.png] Figure 4. STAT's temporal ordering analysis engine indicates that task 1 has made the least progress. In this example, task 1 is stuck in a compute cycle, while the other tasks are blocked in MPI communication, waiting for task 1. _________________________________________________________ Dependent Packages STAT has several dependencies Table 1. STAT Dependent Packages Package What It Does Package Web Page Graphlib Graph creation, merging, and export https://github.com/lee218llnl/STAT Launchmon Scalable daemon co-location http://sourceforge.net/projects/launchmon/ Libdwarf Debug information parsing (Required by StackWalker) http://reality.sgiweb.org/davea/dwarf.html MRNet Scalable multicast and reduction network http://www.paradyn.org/mrnet/ StackWalker Lightweight stack trace sampling http://www.paradyn.org/html/downloads.html In addition, the STAT GUI requires Python with PyGTK, both of which are commonly preinstalled with many Linux operating systems. The Pygments Python module can optionally be installed to allow the STAT GUI to perform syntax highlighting of source code. _________________________________________________________ Installation First run configure. You will need to use the --with-package options to specify the install prefix for mrnet, graphlib, launchmon, libdwarf, and stackwalker. These options will add the necessary includes and library search paths to the compile options. Refer to configure --help for exact options. You may also wish to specify the maximum number of communication processes to launch per node with the option --with-procspernode=number, generally set to the number of cores per node. STAT creates wrapper scripts for the stat-cl command line and stat-gui commands. These wrappers set appropriate paths for the launchmon and mrnet_commnode executables, based on the the --with-launchmon and --with-mrnet configure options, thus it is important to specify both of these even if they share a prefix. STAT will use StackWalker by default. However, it can use Dyninst instead if you specify --with-dyninst to the configure script. To enable the building of the STAT GUI, configure with --enable-gui and also specify --with-python-include-dir=dir, where dir contains Python.h. On BlueGene systems, also be sure to configure --with-bluegene. This will enable the BGL macro for BlueGene specific compilation. Similarly, to compile on Cray systems running alps, specify --with-cray-alps. An example configure line for Cray with alps: ./configure --with-launchmon=/tmp/work/lee218/install \ --with-mrnet=/tmp/work/lee218/install \ --with-graphlib==/tmp/work/lee218/install \ --with-stackwalker=/tmp/work/lee218/install \ --with-libdwarf=/tmp/work/lee218/install \ --prefix=/tmp/work/lee218/install --with-cray-alps \ MPICC=cc MPICXX=CC MPIF77=ftn --enable-shared LD=/usr/bin/ld.x Next you just need to run: make make install Note that STAT hardcodes the paths to its daemon and filter shared object, assuming that they are in $prefix/bin and $prefix/lib respectively, thus testing should be done in the install prefix after running "make install" and the installation directory should not be moved. The path to these components can, however, be overridden with the --daemon and --filter arguments. Further, the STAT_PREFIX environment variable can be defined to override the hardcoded paths in STAT.