From Bare Metal to Virtual: Lessons Learned when a Supercomputing Institute Deploys its First Cloud

07/23/2018
by   Evan F. Bollig, et al.
0

As primary provider for research computing services at the University of Minnesota, the Minnesota Supercomputing Institute (MSI) has long been responsible for serving the needs of a user-base numbering in the thousands. In recent years, MSI---like many other HPC centers---has observed a growing need for self-service, on-demand, data-intensive research, as well as the emergence of many new controlled-access datasets for research purposes. In light of this, MSI constructed a new on-premise cloud service, named Stratus, which is architected from the ground up to easily satisfy data-use agreements and fill four gaps left by traditional HPC. The resulting OpenStack cloud, constructed from HPC-specific compute nodes and backed by Ceph storage, is designed to fully comply with controls set forth by the NIH Genomic Data Sharing Policy. Herein, we present twelve lessons learned during the ambitious sprint to take Stratus from inception and into production in less than 18 months. Important, and often overlooked, components of this timeline included the development of new leadership roles, staff and user training, and user support documentation. Along the way, the lessons learned extended well beyond the technical challenges often associated with acquiring, configuring, and maintaining large-scale systems.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset