Award Abstract # 1703560
CSR: Medium: Next-Generation Cloud Federation via a Geo-Distributed Datastore

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF CALIFORNIA, SANTA BARBARA
Initial Amendment Date: May 15, 2017
Latest Amendment Date: July 23, 2020
Award Number: 1703560
Award Instrument: Continuing Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
 (703)292-5197
CSE Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2017
End Date: June 30, 2023 (Estimated)
Total Intended Award Amount: $1,175,545.00
Total Awarded Amount to Date: $1,183,545.00
Funds Obligated to Date: FY 2017 = $575,697.00
FY 2018 = $8,000.00
FY 2019 = $295,164.00
FY 2020 = $304,684.00
History of Investigator:
  • Chandra Krintz (Principal Investigator)
    ckrintz@cs.ucsb.edu
  • Divyakant Agrawal (Co-Principal Investigator)
  • Amr El Abbadi (Co-Principal Investigator)
  • Richard Wolski (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-Santa Barbara
3227 CHEADLE HALL
SANTA BARBARA
CA  US  93106-0001
(805)893-4188
Sponsor Congressional District: 24
Primary Place of Performance: University of California - Santa Barbara
Santa Barbara
CA  US  93106-5110
Primary Place of Performance Congressional District: 24
Unique Entity Identifier (UEI): G9QBQDH39DF4
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7924, 9251
Program Element Code(s): 735400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Today's cloud computing systems offer users inexpensive and scalable access to a wide range of software services and resources that expedite development, deployment, and management of software and data. Different cloud vendors compete for these users via pricing, service capability and availability, scale, and ease of use, among other features. Despite their advantages, using any single cloud vendor alone limits user choice, results in vendor lock-in and "data gravity" (i.e., the storing of data with the vendor at which the user has previously stored data), and exposes users to greater risk of failures and privacy violations.

This project addresses these limitations with new systems technologies that enable users to leverage multiple cloud infrastructures at once, safely and easily. In particular, it defines a new software abstraction for the scalable data management (datastore) layer, called DatGeo, that bridges geographically distributed cloud federations. The research will use DatGeo to develop new approaches for efficient transactions, partitioning and replication of data, and policy enforcement and mediation, across clouds.

As a result, DatGeo will shield user applications from the complexities associated with low-level federation of individual cloud services, while facilitating location and privacy control, increased reliability, and transparent cross-cloud use and portability. To enable widespread use, the project will make its research artifacts and systems prototypes available as open source. In addition, the project will result in new course materials and activities that engage diverse students who are new to computer science, drawn from local high schools and teaching-focused colleges, and introduce them to computer science as a potential career path.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 53)
Agrawal, D. and El Abbadi, A. "A Wake-up Call: Managing Data in an Untrusted World" IEEE bulletin, v.43, 2020
Agrawal, D. and El Abbadi, A. and Amiri, M.J. and Maiyya, S. and Zakhary, V. "Blockchains and Databases: Opportunities and Challenges for the Permissioned and the Permissionless" Advances in Databases and Information Systems, 2020 https://doi.org/10.1007/978-3-030-54832-2_1
Ahmad, Ishtiyaque and Sarker, Laboni and Agrawal, Divyakant and El Abbadi, Amr and Gupta, Trinabh "Coeus: A System for Oblivious Document Ranking and Retrieval" Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021 https://doi.org/10.1145/3477132.3483586
Ahmad, Ishtiyaque and Yang, Yuntian and Agrawal, Divyakant and El Abbadi, Amr and Gupta, Trinabh "Addra: Metadata-private voice communication over fully untrusted infrastructure" 15th USENIX Symposium on Operating Systems Design and Implementation, OSDI2021, July 14-16, 2021, 2021
Amiri, M. Javad and Allard, T. and Agrawal, D. and El Abbadi, A. "PReVer: Towards Private Regulated Verified Data" Proceedings of the 25th International Conference on Extending Database Technology (EDBT'2022), 2022
Amiri, Mohammad Javad and Agrawal, Divyakant and El Abbadi, Amr "Modern Large-Scale Data Management Systems after 40 Years of Consensus" IEEE 36th International Conference on Data Engineering, 2020 https://doi.org/10.1109/ICDE48307.2020.00172
Amiri, Mohammad Javad and Agrawal, Divyakant and El Abbadi, Amr "On Sharding Permissioned Blockchains" IEEE International Conference on Blockchain, 2019 https://doi.org/10.1109/Blockchain.2019.00044
Amiri, Mohammad Javad and Agrawal, Divyakant and El Abbadi, Amr "Permissioned Blockchains: Properties, Techniques and Applications" SIGMOD'2021, 2021 https://doi.org/10.1145/3448016.3457539
Amiri, Mohammad Javad and Agrawal, Divyakant and El Abbadi, Amr "SharPer: Sharding Permissioned Blockchains Over Network Clusters" International Conference on Management of Data SIGMOD, 2021 https://doi.org/10.1145/3448016.3452807
Amiri, Mohammad Javad and Agrawal, Divyakant and El Abbadi, Amr "CAPER: a cross-application permissioned blockchain" Proceedings of the VLDB Endowment, v.12, 2019 https://doi.org/10.14778/3342263.3342275
Amiri, Mohammad Javad and Duguépéroux, Joris and Allard, Tristan and Agrawal, Divyakant and El Abbadi, Amr "Separ: Towards Regulating Future of Work Multi-Platform Crowdworking Environments with Privacy Guarantees" WWW'2021, 2021 https://doi.org/10.1145/3442381.3449858

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Cloud computing promises vast compute, storage, and communication capabilities to individual users as abstract services at very low financial cost. However, the attractive pricing offered by public clouds introduces “hidden” costs in terms of privacy, reliability, and flexibility that are difficult to mitigate. Public cloud vendors seek to create barriers to exit (“lock-in”) for their customers as a way of ensuring long-term revenue while offering their services in an on-demand market. Once a customer's data is ensconced in a public cloud, it quickly becomes difficult or infeasible to move it, delete it, or access it from different providers.

We address this “data gravity” with new research investigations into DatGeo -- a geo-distributed datastore that is designed to facilitate the federated, concurrent usage of multiple clouds. DatGeo addresses privacy concerns by allowing data to be partitioned among separate cloud providers, but accessed via a single datastore API at the cloud platform level.  It supports cross-site datastore transactions using Replicated Commit to implement geo-distribution. It also extends the datastore to include facilities for geo-partitioning to enhance privacy and geo-replication to enhance reliability beyond what is available from a single cloud provider.  Our approach has been empirical and systems based, resulting in significant open source/data research software artifacts as well as publications that we believe have furthered our collective understanding of federated data management.
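
The idea of partitioning data among providers for privacy, and replicating it across providers for reliability, behind a single datastore API can be illustrated with a small sketch. This is not DatGeo's interface or implementation: the names `CloudSite` and `FederatedStore`, the key-prefix placement table, and the majority-write rule are all assumptions made for illustration, and the sketch does not implement transactions or Replicated Commit.

```python
from dataclasses import dataclass, field


@dataclass
class CloudSite:
    """Hypothetical stand-in for one cloud provider's key-value service."""
    name: str
    available: bool = True
    data: dict = field(default_factory=dict)


class FederatedStore:
    """Illustrative single API in front of several sites (not DatGeo's API)."""

    def __init__(self, sites, placement, default):
        self.sites = {s.name: s for s in sites}
        self.placement = placement  # key prefix -> list of site names
        self.default = default      # site names used when no prefix matches

    def _replicas(self, key):
        # Geo-partitioning: the key prefix decides which sites may hold the data.
        for prefix, names in self.placement.items():
            if key.startswith(prefix):
                return [self.sites[n] for n in names]
        return [self.sites[n] for n in self.default]

    def put(self, key, value):
        # Geo-replication: write to every reachable replica, and refuse the
        # write unless a strict majority of the key's replicas is reachable.
        replicas = self._replicas(key)
        live = [s for s in replicas if s.available]
        if len(live) * 2 <= len(replicas):
            raise RuntimeError(f"no majority of replicas available for {key!r}")
        for site in live:
            site.data[key] = value

    def get(self, key):
        # Read from the first reachable replica that holds the key.
        # No freshness guarantee: a replica that missed a write may be stale.
        for site in self._replicas(key):
            if site.available and key in site.data:
                return site.data[key]
        raise KeyError(key)


a, b, c = CloudSite("cloud_a"), CloudSite("cloud_b"), CloudSite("cloud_c")
store = FederatedStore(
    [a, b, c],
    placement={
        "private/": ["cloud_a"],                        # kept at one provider
        "shared/": ["cloud_a", "cloud_b", "cloud_c"],   # replicated everywhere
    },
    default=["cloud_b"],
)
store.put("private/profile", {"owner": "alice"})
store.put("shared/catalog", ["item-1", "item-2"])

a.available = False                   # simulate an outage at one provider
catalog = store.get("shared/catalog")  # still served by another replica
```

In this sketch the private record exists only at the single site named in the placement table, while the replicated record survives the loss of one provider. A real system would also need cross-site transactions and replica repair, which are left out here.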

Specifically, over the lifetime of this grant, our team has developed a number of new approaches for data management at a global scale. These include new approaches to data caching, placement, replication, and scalability as well as support for multi-cloud use (with new approaches for efficient monitoring, anomaly detection, and management of cross-cloud dependencies). We have also investigated advances that leverage permissionless and permissioned blockchains in new ways to facilitate failure tolerance, distributed deployment sharing, and energy-efficient optimization. Our team has also leveraged these building blocks to develop new adaptive scheduling systems for machine learning applications and programming systems that enhance programmer productivity. Finally, our research has been informed by and validated using real applications of societal import, including those from the domains of digital agriculture, environmental sustainability, and smart homes. Our research artifacts have been made available as open source software and data, and we have published our work in over 50 high-quality venues. Moreover, this project has enabled us to train a large number of students from diverse backgrounds in the next generation of distributed systems.

Last Modified: 09/01/2023
Modified by: Chandra Krintz

Please report errors in award information by writing to: awardsearch@nsf.gov.
