AI for Datacenter Optimization (ADOPT'22)

In conjunction with the IEEE International Parallel and Distributed Processing Symposium

Artificial intelligence (AI) and machine learning (ML) workloads account for an increasingly large share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to a new focus on approaches to optimizing resource usage and allocation and to deploying new AI frameworks. With these changes, there is a need to better understand HPC/cloud/datacenter operations, with the goal of developing improved scheduling policies, identifying inefficiencies in resource utilization and energy/power consumption, predicting failures, and identifying policy violations. Simultaneously, there is growing interest in addressing the increasing power requirements of AI training and inference in HPC/cloud environments, with the goal of minimizing the accompanying effects on climate change.

Several publicly available datasets can enable this research, such as the Argonne Leadership Computing Facility dataset, the Blue Waters System Monitoring dataset, the Philly traces by Microsoft, the Google Cluster Usage traces, and the Atlas cluster trace repository. In addition, the recently released MIT Supercloud Dataset includes monitoring logs from the MIT Supercloud system, comprising time series of CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data.

This workshop will focus on AI/ML approaches to datacenter operations, power/energy modeling, scheduling, optimization, and monitoring tools. In addition to these topics, the workshop solicits papers that address any of the challenges in enabling large-scale AI in HPC/cloud environments, including:

  • Operational insights from deploying and scaling AI/ML workloads in shared systems
  • Parallelization strategies for AI training
  • Optimization strategies including algorithmic and hardware innovations for faster training and inference 
  • Scheduling strategies for AI and traditional HPC workloads in large-scale environments
  • Approaches and instrumentation for measurement of the climate impact of AI

Submission deadlines: 

  • Paper submission deadline extended - February 28, 2022
  • Author notification - March 17, 2022
  • Camera ready deadline - March 21, 2022 

Submission Details: Submissions will be evaluated in a single-blind process. Papers must not exceed 8 single-spaced, double-column pages using 10-point font on 8.5x11 inch pages (excluding references). Manuscripts must use the IEEE Conference style (https://www.ieee.org/conferences/publishing/templates.html). Submitted papers must not be under review at another conference, journal, or workshop. One author of each accepted paper is expected to present at the workshop.

Link for paper submissions: https://easychair.org/conferences/?conf=adopt22

Program Chair

  • Stephanie Brink, LLNL

Program Committee

  • Sudheer Chunduri, ANL
  • Harshitha Gopalakrishnan, LLNL
  • Ivy Peng, LLNL
  • Sarp Oral, ORNL
  • Albert Reuther, MIT 
  • Prof. Devesh Tiwari, Northeastern University
  • Karen Tomko, OSC
  • Feiyi Wang, LLNL
  • Woong Shin, ORNL
  • Siddharth Samsi, MIT

ADOPT'22 Accepted Papers

  • When and How to Retrain Machine Learning-based Cloud Management Systems - Lidia Kidane, Paul Townend, Thijs Metsch and Erik Elmroth
  • Scalable Data Parallel Distributed Training for Graph Neural Networks - Sohei Koyama and Osamu Tatebe
  • The MIT Supercloud Workload Classification Challenge - Benny J. Tang, Qiqi Chen, Matthew L. Weiss, et al.
  • Loss Curve Approximations for Fast Neural Architecture Ranking & Training Elasticity Estimation - Dan Zhao, Vijay Gadepally, Siddharth Samsi, et al.
  • Characterizing Multi-Instance GPU for Machine Learning Workloads - Baolin Li, Vijay Gadepally, Siddharth Samsi and Devesh Tiwari
  • Energy-aware neural architecture selection and hyperparameter optimization - Nathan Frey, Dan Zhao, Simon Axelrod, Michael Jones, David Bestor, Vijay Gadepally, Rafael Gómez-Bombarelli and Siddharth Samsi
  • A Green(er) World for A.I. - Dan Zhao, Nathan C. Frey, Joseph McDonald, Matthew Hubbell, David Bestor, Michael Jones, Andrew Prout, Vijay Gadepally and Siddharth Samsi