### OVERVIEW  
**Intel(R) VTune(TM) Fabric Profiler  
VERSION 2.0.0  
ONLY FOR EVALUATION USE**

0. Introduction
1. Data collector
2. Analyzer
3. Example applications
4. Sample trace files
5. The fpro script
6. Known and typical issues

## **0. Introduction**

The Fabric Profiler is a tool for identifying, in detail, how the fabric responds to the load that a given Intel® SHMEM or OpenSHMEM application places on it.   
It is composed of two parts:

The "**data collector**" monitors application and node-local cluster injection/ejection behavior while the application is running.  It outputs a set of trace files, which are collected and merged across the cluster once the user job completes.  These trace files hold the runtime data collected by the profiler, including SHMEM call traces and cluster network counter data, timestamped and time-base synchronized.  They are binary files intended to be consumed by the analyzer.

The "**analyzer**" is a collection of MATLAB tools that runs on a Linux workstation after the application has completed. These tools display profiling results with interactive features that allow exploring a multitude of communication-centric behaviors.

Note that the Fabric Profiler tools are *prototypes*. Their goal is to determine what information is useful to developers, and how best to display the sometimes complex relationship between workload and fabric activity.

There are several example applications you can run, and also some pre-baked trace files you can explore if you want to use the analyzer without running an application with the data collector first.  It is recommended to start with the 'sanity' examples, and to then take a look at the examples which ship with Intel® SHMEM v1.0.0.

**This is a binary tool release.**  

The clusters used to develop the profiler, and therefore the supported configuration, were based on Intel SPR nodes with Intel® Data Center GPU Max Series (XeLink) GPUs and an HPE Slingshot11 fabric.

## **1. Data collector**
The **data collector** operates in one of two modes:

*  **OpenSHMEM-only.** This mode provides the features of the original version of this tool, including full OpenSHMEM API coverage.
   - This mode assumes that the standard OpenSHMEM calls *shmem_init() / shmem_finalize()* are used to initialize and finalize the library.
   - No Intel® SHMEM API will be tracked in this mode.
   - This release was built against Sandia OpenSHMEM v1.5.2 (SoS) as specified by Intel® SHMEM v1.0.0.
 
*  **Intel® SHMEM.** This mode adds profiling of the Intel® SHMEM Host API, a subset of the overall API.
   - This release supports host-only Intel® SHMEM v1.0.0 calls.  GPU-resident profiling is not supported, so no GPU-resident API calls are currently monitored.
   - This mode assumes that the Intel® SHMEM API is used to initialize and finalize the library.
   - This mode supports the same set of OpenSHMEM API calls as OpenSHMEM-only mode, and adds initial support for Intel® SHMEM.


The collector's mode of operation is controlled under the hood by selecting one of two shared object files, located under *$ESP_ROOT/bin/collector/ishmem_shmem* and *$ESP_ROOT/bin/collector/shmem_only*. The *fpro* job launch script, described in section 5, is the recommended starting point for putting together a working PBS script that selects the proper collector version. If you inspect the fpro-generated PBS script, you will see which Fabric Profiler library (libesp.so) it specifies.
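
As a rough illustration of that selection, the sketch below maps a mode name to a collector library path. It assumes that `libesp.so` sits directly inside each of the two directories named above, which this document does not confirm, and `collector_lib` is a hypothetical helper that is not part of the release. The fpro-generated script remains the authoritative reference.

```shell
# Hypothetical helper: print the collector library path for a given mode.
# Assumption: libesp.so lives directly inside the per-mode directory.
# $1: mode directory name, either "ishmem_shmem" or "shmem_only"
collector_lib() {
    mode="$1"
    case "$mode" in
        ishmem_shmem|shmem_only)
            printf '%s/bin/collector/%s/libesp.so\n' "$ESP_ROOT" "$mode"
            ;;
        *)
            echo "unknown collector mode: $mode" >&2
            return 1
            ;;
    esac
}
```

For example, `collector_lib shmem_only` prints a path under `$ESP_ROOT/bin/collector/shmem_only`, and any other mode name is rejected with a non-zero status.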

The data collector is implemented as a shared object library that intercepts the application's Intel® SHMEM and/or OpenSHMEM calls and monitors fabric activity in the background.  The tool is intended for parallel application developers who want to understand the relationship between their code and the fabric's response.

## **2. Analyzer**
The analyzer is located in the release directory in *$ESP_ROOT/bin/analyzer*. It is a MATLAB program named "*fpro_analyzer*".   
**The free MATLAB Runtime Environment (R2021b) is required.**
See README.ANALYZER.md for information on how to download the MATLAB Runtime Environment and use the analyzers to visualize trace files generated at application run-time.

## **3. Example applications**

The example applications are found in the release directory in *$ESP_ROOT/examples*.
There are two classes of examples, one for each collector mode of operation:  OpenSHMEM-only and Intel® SHMEM.

***OpenSHMEM-only (legacy):***  
**- ISx** - This is an OpenSHMEM-only integer sort obtained from the [ISx GitHub repository](https://github.com/ParRes/ISx).  
**- sanity** - This OpenSHMEM-only application sends data from one PE to the next, using the CPUs, and is a good place to start to ensure the cluster and node(s) are working as expected.  

***Intel® SHMEM:***  
**- sycl_sanity** - This Intel® SHMEM application requires at least 2 PEs to participate. It sends data from one PE to the next, using the PVC GPUs. Each PE sends and receives the same amount of data, so this is a good sanity check that things are working correctly on the GPU-side.  

See the README file in each example subdirectory for steps to build and run these applications.
Sample fpro command lines are provided, which should get you going quickly.

## **4. Sample Trace files**

Directory *$ESP_ROOT/examples/sampletraces* contains a compressed example output of a run of 'sanity'
on a small Slingshot cluster, including merged trace files.  This is useful if you would like to try the analyzer without running an instrumented application first. 

See README.ANALYZER.md for help in using the analyzer.  The analyzers will prompt you for input trace files.  Decompress and un-tar the example traces, then point the analyzer at the directory created.
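
The unpacking step can be sketched as below. The archive name is not given in this document, so it is passed in as a parameter; `unpack_traces` is a hypothetical helper, and the sketch assumes a `tar` that detects the compression format on extraction (GNU tar and bsdtar both do).

```shell
# Hypothetical helper: unpack a sample-trace archive and print the target directory.
# $1: archive path (use the file shipped in $ESP_ROOT/examples/sampletraces)
# $2: destination directory for the extracted traces
unpack_traces() {
    archive="$1"
    dest="$2"
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # Plain -xf leaves compression detection to tar itself.
    tar -xf "$archive" -C "$dest" || return 1
    printf '%s\n' "$dest"
}
```

The printed directory (or the subdirectory the archive creates inside it) is what you would hand to the analyzer when it prompts for input trace files.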

## **5. The fpro script**
The fpro helper script can be used to run a profiled parallel job on the cluster. It uses environment variables (below) as input to create a suitable launch script for the selected job management system.  By default, PBS is assumed.  

The fpro script can be found at *$ESP_ROOT/collector/bin*.  Running fpro with no arguments displays a rudimentary help message that explains the available command-line switches.  Of particular note:

- $ESP_ROOT must be defined prior to using fpro. 
- -j (PBS | SLURM): Job scheduler select (PBS for Slingshot-based clusters, SLURM for others). PBS is the default and optionally takes a companion ***-r*** switch to specify a reservation number.
- -l (0 | 1): Legacy OpenSHMEM mode select.  When 1, the profiler is configured for legacy OpenSHMEM-only (SoS-based) profiling. When 0, it is configured for Intel® SHMEM, as in the example usages later in this section.
- -n: The number of nodes to run on. This assumes you have created any requisite reservation.
- -p: The number of PEs per node.  

**Environment variables**:  
Several environment variables must be set before running fpro. Once you are settled in, it is recommended that you script them for yourself; refer to setMyVars.sh as a starting point.

- **ESP_ROOT**: Used throughout, so set it first. It should point to the root of the directory where you decompressed Fabric Profiler.
- **ESP_TRACE_PATH**: The location where Fabric Profiler should deposit its trace files when done.
- **ESP_WORK**: The location where the profiler does its work, including the creation of a tmp directory. It should not be the same as ESP_TRACE_PATH.
- **ESP_SHMEM_ROOT**: The root of the OpenSHMEM **install** directory (tested with SoS).
- **ESP_ISHM_INSTALL**: The path to your Intel® SHMEM installation.
- **PIN_ROOT**: The path to the Pin installation root (ex: /mnt/$USER/pin/pin-3.28-98749-g6643ecee5-gcc-linux).

*Refer to $ESP_ROOT/config/VARS.sh, where these and many other environment variables are managed.  Many can be changed externally via 'export', but some require editing that file, depending on your needs.*  
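
A personal setup script in the spirit of setMyVars.sh might look like the sketch below. The contents of setMyVars.sh are not reproduced here, and every path is a placeholder chosen for illustration (only the PIN_ROOT value echoes the example above), so substitute your own locations.

```shell
# Hypothetical personal setup script; source it before running fpro.
# All paths are placeholders, not locations defined by the release.
export ESP_ROOT="$HOME/fabric-profiler"        # where the release was decompressed
export ESP_TRACE_PATH="$HOME/fpro/traces"      # trace files are deposited here
export ESP_WORK="$HOME/fpro/work"              # scratch area, distinct from ESP_TRACE_PATH
export ESP_SHMEM_ROOT="$HOME/sw/sos/install"   # OpenSHMEM (SoS) install root
export ESP_ISHM_INSTALL="$HOME/sw/ishmem"      # Intel SHMEM installation
export PIN_ROOT="/mnt/${USER:-$(id -un)}/pin/pin-3.28-98749-g6643ecee5-gcc-linux"

# Warn about the one constraint stated in the list above.
if [ "$ESP_WORK" = "$ESP_TRACE_PATH" ]; then
    echo "ESP_WORK and ESP_TRACE_PATH should be different directories" >&2
fi
```

Using absolute paths throughout keeps the values valid when the generated job script runs from a different working directory.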
	 

**fpro** defaults to PBS, but the scheduler can be changed with the -j command-line switch.  Here are some example usages:
- **./fpro -j pbs -r "R1234" -n 1 -p 2 -l 0 $ESP_ROOT/examples/iSHMEM/sycl_sanity/sycl_sanity**
	- This creates a PBS script that uses reservation R1234 with one node and 2 PEs per node, submits the job script, and optionally displays an output banner and progress info.  
	- As mentioned previously, note the selection of the data collector mode via "***-l 0***" (lowercase L). This tells the fpro script that the user will be using Intel® SHMEM mode.  
- **./fpro -j pbs -n 1 -p 2 -l 0 $ESP_ROOT/examples/iSHMEM/sycl_sanity/sycl_sanity**
	- This is the same, except it creates a reservationless PBS script. 
- **./fpro -j slurm -n 1 -p 2 -l 0 $ESP_ROOT/examples/iSHMEM/sycl_sanity/sycl_sanity**
	- Selects SLURM instead of PBS, resulting in the creation and submission of a .slurm file. 
  
  **A successful run of the profiler would look something like this:**  
  
    \==========================  
    fpro [Begin]  
     [Host Name: myHost]  
     [Workload: sycl_sanity ]  
     [#Nodes: 1]  
     [#PEs per node: 6]  
     [Job Scheduler: pbs]  
     [Temp Work dir: /mnt/scratch/user/tmp/sycl_sanity.1703174823161]  
    \==========================  

    \====  
    fpro [Generating PBS script]  
    \====   
    \====  
    fpro [running sycl_sanity in job 79547.myHost-pbs1]  
    \====  
    \====  
    fpro [job complete]  
    \====  
    \====  
    fpro [post-processing trace files]  
    INFO: mergeFuncFile complete!  
    INFO: mergeProfileFile complete!  
    INFO: mergePutFile complete!  
    \====  
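
The example invocations earlier in this section can also be assembled programmatically. The sketch below only prints a command line and does not run fpro. `fpro_cmdline` is a hypothetical helper, it uses only the switches documented in this section, and it does not handle paths that contain spaces.

```shell
# Hypothetical helper: print an fpro command line from its parts.
# $1 scheduler (pbs|slurm)   $2 number of nodes   $3 PEs per node
# $4 value for -l            $5 application path  $6 optional PBS reservation
fpro_cmdline() {
    sched="$1"; nodes="$2"; ppn="$3"; legacy="$4"; app="$5"; resv="${6:-}"
    case "$sched" in
        pbs|slurm) ;;
        *) echo "scheduler must be pbs or slurm" >&2; return 1 ;;
    esac
    cmd="./fpro -j $sched"
    # -r is described as a companion to the PBS scheduler choice only.
    if [ -n "$resv" ] && [ "$sched" = "pbs" ]; then
        cmd="$cmd -r \"$resv\""
    fi
    printf '%s -n %s -p %s -l %s %s\n' "$cmd" "$nodes" "$ppn" "$legacy" "$app"
}
```

Printing the command first makes it easy to review the node count, PE count, and mode before submitting a job.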

## **6. Known and Typical Issues**
Known Issues:
* CXI counters are updated at 1 Hz by the CXI driver. This may change in the future as systems mature, but it is unrelated to the profiler's sampling rate.
  
(Collector) Here are a few things we run into sometimes:
* ESP_ROOT or other environment variable(s) not set correctly, for example by using $PWD or relative paths.
* LD_LIBRARY_PATH not set correctly for dependent libraries.
* PBS script requires tweaking to account for a bad path, bad reservation, bug, etc.
* Cluster state not as needed or expected.
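
Some of these can be caught before a job is submitted. The sketch below is a hypothetical pre-flight check, not part of the release: it verifies that the environment variables described in section 5 are set to absolute paths and that LD_LIBRARY_PATH is not empty. It cannot tell whether the paths are correct or whether the cluster is in the expected state.

```shell
# Return 0 if the named variable holds an absolute path, 1 otherwise.
check_abs_var() {
    name="$1"
    eval "value=\${$name:-}"
    case "$value" in
        /*) return 0 ;;
        "") echo "$name is not set" >&2; return 1 ;;
        *)  echo "$name is not an absolute path: $value" >&2; return 1 ;;
    esac
}

# Check every variable in turn and report all problems, not just the first.
fpro_preflight() {
    status=0
    for v in ESP_ROOT ESP_TRACE_PATH ESP_WORK ESP_SHMEM_ROOT ESP_ISHM_INSTALL PIN_ROOT; do
        check_abs_var "$v" || status=1
    done
    if [ -z "${LD_LIBRARY_PATH:-}" ]; then
        echo "LD_LIBRARY_PATH is empty; dependent libraries may not be found" >&2
        status=1
    fi
    return "$status"
}
```

Running `fpro_preflight` after sourcing your variable script, and before calling fpro, gives a quick pass/fail signal through its exit status.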
