Enabling Microsoft OneDrive Integration with HTCondor

07/08/2019
by Derek Weitzel, et al.

Accessing data from distributed computing is essential in many workflows, but it can be complicated for users of cyberinfrastructure. They must perform multiple steps, using unfamiliar tools, to make data available to distributed computing. Further, most research on data distribution has focused on the efficiency of providing data to computing resources rather than on the ease of use of distributing data. Creating an easy-to-use data distribution method can reduce the time researchers spend learning cyberinfrastructure and increase its usefulness. Microsoft OneDrive is an online storage solution providing both file storage and sharing. OneDrive provides many different clients to access data stored in the service, and it offers many features that users of cyberinfrastructure could find useful, such as automatic synchronization with desktop clients. A barrier to using services such as OneDrive is the credential management necessary to access the service. Recent innovations in HTCondor allow the scheduler to manage OAuth credentials on the user's behalf. The user no longer has to copy credentials along with the job; HTCondor handles the acquisition, renewal, and secure transfer of credentials. In this paper, I focus on providing an easy-to-use data distribution method utilizing Microsoft OneDrive. Measuring ease of use is difficult; therefore, I will describe the features and advantages of using OneDrive. Additionally, I will compare it to measurements of data distribution methods currently used on a national cyberinfrastructure, the Open Science Grid.

