cuDF is a GPU DataFrame library for Python. It provides a pandas-like API and accelerates DataFrame operations on a single GPU. However, the size of the dataset it can handle is limited by the memory available on that GPU. Dask provides a framework for scalable computing, and Dask-cuDF integrates cuDF with Dask so that a large DataFrame workload can be scaled across multiple GPUs. This webinar introduces Dask-cuDF with demo examples on a multi-GPU node on the national clusters.
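To give a flavour of the idea described above, here is a minimal sketch and not the code shown in the webinar. It assumes a node with NVIDIA GPUs and the RAPIDS packages (cudf, dask_cudf, dask_cuda) installed. The function name, the column names, and the toy data are hypothetical and chosen only for illustration.

```python
# Sketch only: requires NVIDIA GPUs and the RAPIDS packages
# (cudf, dask_cudf, dask_cuda). Names and data are hypothetical.

def grouped_mean_multi_gpu(n_rows=100_000, n_partitions=4):
    """Compute a grouped mean with the work spread across the local GPUs."""
    # Imports are deferred so that this module still loads on a machine
    # without RAPIDS; the function itself will fail there when called.
    import cudf
    import dask_cudf
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    # LocalCUDACluster starts one Dask worker per visible GPU on this node.
    with LocalCUDACluster() as cluster, Client(cluster):
        # Build a toy cuDF DataFrame on a single GPU.
        gdf = cudf.DataFrame({
            "key": [i % 10 for i in range(n_rows)],
            "value": list(range(n_rows)),
        })
        # Split it into partitions that Dask can schedule on different GPUs.
        ddf = dask_cudf.from_cudf(gdf, npartitions=n_partitions)
        # The pandas-like expression is lazy; compute() triggers execution.
        result = ddf.groupby("key")["value"].mean().compute()
        # Copy the small result back to host memory as a pandas Series.
        return result.to_pandas()
```

Calling grouped_mean_multi_gpu() on a multi-GPU node would return the per-key means. For data that does not fit in the memory of one GPU, a reader such as dask_cudf.read_csv would normally replace the in-memory toy DataFrame.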
_______________________________________________
This webinar was presented by Jinhui Qin (SHARCNET) on February 22nd, 2023, as part of a series of weekly Compute Ontario Colloquia. The webinar was hosted by SHARCNET. The colloquia cover different advanced research computing (ARC) and high-performance computing (HPC) topics, are approximately 45 minutes in length, and are delivered by experts in the relevant fields. Further details can be found on this web page: https://www.computeontario.ca/trainin... . Recordings, slides, and other materials can be found here: https://helpwiki.sharcnet.ca/wiki/Onl...
SHARCNET is a consortium of 19 Canadian academic institutions that share a network of high-performance computers (http://www.sharcnet.ca). SHARCNET is a part of Compute Ontario (http://computeontario.ca/) and the Digital Research Alliance of Canada (https://alliancecan.ca).