Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

08/22/2017
by Mangesh Bendre, et al.

Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 20 storage, and up to 50
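
To make the cascading-update problem concrete, the following is a minimal sketch, not DataSpread's actual storage engine. It assumes a hypothetical `PositionalTable` class that stores each row under a stable identifier and keeps position in a separate mapping, so inserting a row in the middle never rewrites the rows that follow it. The Python list used for the mapping still shifts entries in linear time; it only illustrates the indirection and makes no claim about the constant-time behaviour reported in the abstract.

```python
class PositionalTable:
    """Hypothetical illustration: rows live under stable ids, and a
    separate list maps position -> id, so stored rows never carry an
    explicit row number that would need renumbering."""

    def __init__(self):
        self._rows = {}    # stable id -> row contents
        self._order = []   # position (0-based) -> stable id
        self._next_id = 0

    def insert(self, position, row):
        """Insert `row` so that it ends up at `position`; return its stable id."""
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = row
        # Only the mapping changes; no stored row is touched.
        self._order.insert(position, row_id)
        return row_id

    def delete(self, position):
        """Remove and return the row currently at `position`."""
        row_id = self._order.pop(position)
        return self._rows.pop(row_id)

    def get(self, position):
        """Return the row currently at `position`."""
        return self._rows[self._order[position]]

    def __len__(self):
        return len(self._order)


# Usage: insert a row between two existing rows.
table = PositionalTable()
id_a = table.insert(0, ["a"])
id_c = table.insert(1, ["c"])
id_b = table.insert(1, ["b"])   # lands between "a" and "c"
rows_in_order = [table.get(i) for i in range(len(table))]
```

After the middle insert, the row holding `"c"` is reachable at position 2 while its stored record and stable id are unchanged, which is the property that avoids cascading updates. A scalable engine would replace the plain list with an order-aware index.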


