
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | May 15, 2017 |
Latest Amendment Date: | July 23, 2020 |
Award Number: | 1703560 |
Award Instrument: | Continuing Grant |
Program Manager: | Marilyn McClure, mmcclure@nsf.gov, (703)292-5197, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | July 1, 2017 |
End Date: | June 30, 2023 (Estimated) |
Total Intended Award Amount: | $1,175,545.00 |
Total Awarded Amount to Date: | $1,183,545.00 |
Funds Obligated to Date: | FY 2018 = $8,000.00; FY 2019 = $295,164.00; FY 2020 = $304,684.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 3227 CHEADLE HALL SANTA BARBARA CA US 93106-0001 (805)893-4188 |
Sponsor Congressional District: |
|
Primary Place of Performance: | Santa Barbara CA US 93106-5110 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CSR-Computer Systems Research |
Primary Program Source: | 01001819DB NSF RESEARCH & RELATED ACTIVIT; 01001920DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Today's cloud computing systems offer users inexpensive and scalable access to a wide range of software services and resources that expedite development, deployment, and management of software and data. Different cloud vendors compete for these users via pricing, service capability and availability, scale, and ease of use, among other features. Despite their advantages, using any single cloud vendor alone limits user choice, results in vendor lock-in and "data gravity" (i.e., the tendency to keep storing new data with the vendor that already holds the user's existing data), and exposes users to greater risk of failures and privacy violations.
This project addresses these limitations with new systems technologies that enable users to leverage multiple cloud infrastructures at once, safely and easily. In particular, it defines a new software abstraction for the scalable data management (datastore) layer, called DatGeo, that bridges geographically distributed cloud federations. The research will use DatGeo to develop new approaches for efficient transactions, partitioning and replication of data, and policy enforcement and mediation, across clouds.
As a result, DatGeo will shield user applications from the complexities associated with low-level federation of individual cloud services, while facilitating location and privacy control, increased reliability, and transparent cross-cloud use and portability. To enable widespread use, the project will make its research artifacts and systems prototypes available as open source. In addition, the project will result in new course materials and activities that engage diverse students who are new to computer science, drawn from local high schools and teaching-focused colleges, and introduce them to computer science as a potential career path.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Cloud computing promises vast compute, storage, and communication capabilities to individual users as abstract services for very low financial cost. However, the attractive pricing offered by public clouds introduces “hidden” costs in terms of privacy, reliability, and flexibility that are difficult to mitigate. Public cloud vendors seek to create barriers to exit (“lock-in”) for their customers as a way of ensuring long-term revenue while offering their services in an on-demand market. As a result, once a customer's data is ensconced in a public cloud, it quickly becomes difficult or infeasible to move it, delete it, or access it from different providers.
We address this “data gravity” with new research investigations into DatGeo -- a geo-distributed datastore that is designed to facilitate the federated, concurrent usage of multiple clouds. DatGeo addresses privacy concerns by allowing data to be partitioned among separate cloud providers, but accessed via a single datastore API at the cloud platform level. It supports cross-site datastore transactions using Replicated Commit to implement geo-distribution. It also extends the datastore to include facilities for geo-partitioning to enhance privacy and geo-replication to enhance reliability beyond what is available from a single cloud provider. Our approach has been empirical and systems based, resulting in significant open-source software and data artifacts as well as publications that we believe have furthered our collective understanding of federated data management.
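As a rough illustration of the idea of one datastore API in front of several cloud providers, the following minimal Python sketch routes each write to sites permitted by a placement policy (geo-partitioning) and copies it to more than one site (geo-replication). It is not DatGeo's actual interface or implementation: the names CloudSite, FederatedStore, allowed_regions, and replicas are hypothetical, each provider is modeled as an in-memory dictionary, and transactions (including Replicated Commit) are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class CloudSite:
    """Stand-in for one provider's datastore (hypothetical; an in-memory dict)."""
    name: str
    region: str
    data: dict = field(default_factory=dict)


class FederatedStore:
    """Hypothetical single API over several cloud sites (illustration only)."""

    def __init__(self, sites, replicas=1):
        self.sites = list(sites)
        self.replicas = replicas

    def _targets(self, key, allowed_regions=None):
        # Geo-partitioning: keep only sites in regions the policy permits.
        candidates = [s for s in self.sites
                      if allowed_regions is None or s.region in allowed_regions]
        if not candidates:
            raise ValueError("no site satisfies the placement policy")
        # A byte sum keeps placement deterministic across runs
        # (Python's built-in hash() of a str is randomized per process).
        start = sum(key.encode("utf-8")) % len(candidates)
        count = min(self.replicas, len(candidates))
        # Geo-replication: write to `count` consecutive candidate sites.
        return [candidates[(start + i) % len(candidates)] for i in range(count)]

    def put(self, key, value, allowed_regions=None):
        """Store value on the chosen sites; return the names of those sites."""
        targets = self._targets(key, allowed_regions)
        for site in targets:
            site.data[key] = value
        return [site.name for site in targets]

    def get(self, key):
        """Return the value from the first site that holds the key."""
        for site in self.sites:
            if key in site.data:
                return site.data[key]
        raise KeyError(key)
```

In this sketch, a caller that passes allowed_regions={"eu"} never has its record written to a site outside that region, while get() hides which provider actually holds the data. A real system would also need consistency across replicas and failure handling, which this example does not attempt.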
Specifically, over the lifetime of this grant, our team has developed a number of new approaches for data management at a global scale. This includes new approaches to data caching, placement, replication, and scalability as well as support for multi-cloud use (with new approaches for efficient monitoring, anomaly detection, and management of cross-cloud dependencies). We have also investigated advances that have leveraged permissionless and permissioned blockchains in new ways to facilitate failure tolerance, distributed deployment sharing, and energy-efficient optimization. Our team has also leveraged these building blocks to develop new adaptive scheduling systems for machine learning applications and programming systems that enhance programmer productivity. Finally, our research has been informed by and validated using real applications of societal import, including those from the domains of digital agriculture, environmental sustainability, and smart homes. Our research artifacts have been made available as open source software and data, and we have published our work in over 50 high-quality venues. Moreover, this project has enabled us to train a large number of students from diverse backgrounds in the next generation of distributed systems.
Last Modified: 09/01/2023
Modified by: Chandra Krintz