ADOPT'22

The ADOPT Workshop was held on May 30th, 2022 in conjunction with the IEEE International Parallel and Distributed Processing Symposium.

The workshop program is below:

10:00am - 10:05am - Opening remarks

10:10am - 10:50am - Keynote (Dr. Neil Thompson, MIT)

11:00am - 1:30pm - Contributed papers (in order of presentation)

  • When and How to Retrain Machine Learning-based Cloud Management Systems - Lidia Kidane, Paul Townend, Thijs Metsch and Erik Elmroth
  • Scalable Data Parallel Distributed Training for Graph Neural Networks - Sohei Koyama and Osamu Tatebe
  • The MIT Supercloud Workload Classification Challenge - Benny J. Tang, Qiqi Chen, Matthew L. Weiss, et al.
  • Loss Curve Approximations for Fast Neural Architecture Ranking & Training Elasticity Estimation - Dan Zhao, Vijay Gadepally, Siddharth Samsi, et al.
  • Characterizing Multi-Instance GPU for Machine Learning Workloads - Baolin Li, Vijay Gadepally, Siddharth Samsi and Devesh Tiwari
  • Energy-aware neural architecture selection and hyperparameter optimization - Nathan Frey, Dan Zhao, Simon Axelrod, Michael Jones, David Bestor, Vijay Gadepally, Rafael Gómez-Bombarelli and Siddharth Samsi
  • A Green(er) World for A.I. - Dan Zhao, Nathan C. Frey, Joseph McDonald, Matthew Hubbell, David Bestor, Michael Jones, Andrew Prout, Vijay Gadepally and Siddharth Samsi

 

Artificial intelligence (AI) and machine learning (ML) workloads are an increasingly large share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to a new focus on approaches to optimizing resource usage, allocation, and the deployment of new AI frameworks. With these changes, there is a need to better understand HPC/cloud/datacenter operations with the goal of developing improved scheduling policies, identifying inefficiencies in resource utilization and energy/power consumption, predicting failures, and identifying policy violations. Simultaneously, there is growing interest in addressing the increasing power requirements of AI training and inference operations in HPC/cloud environments, with the goal of minimizing the accompanying effects on climate change.

Several publicly available datasets can enable this research, such as the Argonne Leadership Computing Facility dataset, the Blue Waters System Monitoring dataset, the Philly traces by Microsoft Inc., the Google Cluster Usage traces, and the Atlas cluster trace repository. In addition, the recently released MIT Supercloud Dataset includes monitoring logs from the MIT Supercloud system, which include time series of CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data.

This workshop will focus on AI/ML approaches to datacenter operations, power/energy modeling, scheduling, optimization, and monitoring tools. In addition to these topics, this workshop solicits papers that address any of the challenges in enabling large-scale AI in HPC/cloud environments, including:

  • Operational insights from deploying and scaling AI/ML workloads in shared systems
  • Parallelization strategies for AI training
  • Optimization strategies including algorithmic and hardware innovations for faster training and inference
  • Scheduling strategies for AI and traditional HPC workloads in large-scale environments
  • Approaches and instrumentation for measurement of the climate impact of AI

Submission deadlines:

  • Paper submission deadline (extended) - February 28, 2022
  • Author notification - March 17, 2022
  • Camera-ready deadline - March 21, 2022

Submission Details: Submissions will be evaluated in a single-blind process. Papers must not exceed 8 single-spaced, double-column pages using 10-point font on 8.5 x 11 inch pages (excluding references). Manuscripts must use the IEEE conference style, available here: https://www.ieee.org/conferences/publishing/templates.html. Submitted papers must not be under review at another conference, journal, or workshop. One author of each accepted paper is expected to present at the workshop.

Link for paper submissions: https://easychair.org/conferences/?conf=adopt22

Program Chair

Stephanie Brink, LLNL

Program Committee

Sudheer Chunduri, ANL
Harshitha Gopalakrishnan, LLNL
Ivy Peng, LLNL
Sarp Oral, ORNL
Albert Reuther, MIT 
Prof. Devesh Tiwari, Northeastern University
Karen Tomko, OSC
Feiyi Wang, LLNL
Woong Shin, ORNL
Siddharth Samsi, MIT