We'll describe some sources of information that you can access from within a running job, and show how they can offer insight into your job's behavior. The talk focuses on relatively simple, DIY-style data collection rather than on more complicated profiling frameworks. For example, if you're interested in your carbon footprint, our systems expose power counters that your programs can query. Many other "gauges" and counters provide information about memory use, cache misses, sources of slowdowns, and so on.
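As a rough illustration of this kind of DIY collection (a sketch under stated assumptions, not material taken from the talk), a Python program on Linux could sample a few of its own gauges as shown here. The helper names `parse_status_kb`, `read_text`, and `snapshot` are hypothetical. The RAPL path is an assumption: it exists only on some Intel/AMD systems and is often readable only by root, so the sketch returns None when it is unavailable.

```python
import resource


def parse_status_kb(text, key):
    """Return the kB value of a field such as 'VmHWM' from /proc/<pid>/status text, or None."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == key:
            parts = rest.split()
            if parts and parts[0].isdigit():
                return int(parts[0])
    return None


def read_text(path):
    """Read a small pseudo-file, returning None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


def snapshot():
    """Collect a few per-process gauges and counters for the current process."""
    status = read_text("/proc/self/status") or ""
    # Assumed path: cumulative package energy in microjoules (Linux powercap/RAPL).
    energy = read_text("/sys/class/powercap/intel-rapl:0/energy_uj")
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "rss_kb": parse_status_kb(status, "VmRSS"),       # current resident memory
        "peak_rss_kb": parse_status_kb(status, "VmHWM"),  # high-water mark of resident memory
        "cpu_user_s": usage.ru_utime,                     # CPU seconds spent in user mode
        "cpu_sys_s": usage.ru_stime,                      # CPU seconds spent in the kernel
        "energy_uj": int(energy) if energy and energy.strip().isdigit() else None,
    }


if __name__ == "__main__":
    print(snapshot())
```

Calling `snapshot()` at the start and end of a job step and subtracting the counter fields gives the CPU time (and, where readable, the energy) used in between. The energy counter is cumulative and wraps around, so a negative difference means a wrap occurred, and on a shared node it covers the whole CPU package rather than one job.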
_______________________________________________
This webinar was presented by Mark Hahn (SHARCNET) on July 26th, 2023, as part of a series of weekly Compute Ontario Colloquia. The webinar was hosted by SHARCNET. The colloquia cover a range of advanced research computing (ARC) and high performance computing (HPC) topics, are approximately 45 minutes in length, and are delivered by experts in the relevant fields. Further details can be found on this web page: https://www.computeontario.ca/trainin... Recordings, slides, and other materials can be found here: https://helpwiki.sharcnet.ca/wiki/Onl...
SHARCNET is a consortium of 19 Canadian academic institutions that share a network of high performance computers (http://www.sharcnet.ca). SHARCNET is part of Compute Ontario (http://computeontario.ca/) and the Digital Research Alliance of Canada (https://alliancecan.ca).