
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | June 29, 2018 |
Latest Amendment Date: | June 29, 2018 |
Award Number: | 1813485 |
Award Instrument: | Standard Grant |
Program Manager: | Daniela Oliveira, doliveir@nsf.gov, (703)292-0000, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2018 |
End Date: | August 31, 2023 (Estimated) |
Total Intended Award Amount: | $336,873.00 |
Total Awarded Amount to Date: | $336,873.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 5250 CAMPANILE DR SAN DIEGO CA US 92182-1901 (619)594-5731 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 5500 Campanile Drive San Diego CA US 92182-7455 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CSR-Computer Systems Research |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
A file system is the software module in charge of how files are named, stored, and retrieved. Existing memory systems are based on dynamic random-access memory (DRAM), which is reaching its density and power ceiling. Emerging persistent memory technologies like phase change memory (PCM) not only provide a denser, more energy-efficient alternative to DRAM, but also allow file systems to be built atop them. This project will contribute to memory and storage technologies by developing a new single-level persistent memory architecture and a new file system dedicated to it.
The new architecture will turn a small DRAM-based main memory system into a large-capacity persistent memory system that can access files in place. This project will proceed along two thrusts. First, it will bridge the technology gap in building high-performance persistent memory systems through an in-depth investigation. Second, it will develop the first file system devoted to a single-level persistent store to efficiently manage data, a capability increasingly demanded by data-intensive applications.
This project will benefit society by developing high-performance memory systems that will significantly improve the performance and energy-efficiency of future big data applications, which are revolutionizing all aspects of human lives ranging from enterprises to consumers, from science to government. In the long term, techniques developed in this project will be transferable to servers/clusters and even to large-scale distributed storage systems for big data applications, where performance requirements are more stringent. This project will also promote teaching, learning, and training by exposing students to technological and scientific underpinnings in the field of big data storage systems.
The project outcomes, including published papers, technical reports, presentations, course modules, and a repository of the software code, will be made available for free download at http://taoxie.sdsu.edu/, where they will be kept for ten years. The source code of the new persistent file system will also be posted on GitHub.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Emerging byte-addressable persistent memory (PM) technologies like 3D XPoint show great potential to become a viable alternative to DRAM (dynamic random access memory) for scalable main memories. Compared with traditional DRAM, they offer several desirable features, such as much larger capacity and higher energy efficiency. In addition, with the advent of the big data era, data-intensive applications like DNA sequencing and geographic information systems often require the delivery and analysis of large data sets. As a result, the conventional compute-centric computing framework is becoming increasingly inadequate, because delivering these large data sets all the way from an external storage system to host CPUs (central processing units) incurs substantial data transfer latency and energy consumption. To address these challenges, near-data processing (NDP) has been proposed to move computation to the data.
To tap the potential of PM technologies and NDP, this project developed an array of new software techniques and tools, which include: (1) a new file system that is completely decoupled from the conventional DRAM-based volatile main memory; (2) two new indexing data structures that run on persistent memory; (3) two new reconfigurable NDP-powered servers; (4) two new kNN (k-Nearest Neighbor) kernels on FPGAs (Field Programmable Gate Arrays); and (5) a prototype of a file system middleware.
A new persistent file system called SPFS (Single-level Persistent File System) was developed. It completely bypasses conventional DRAM-based volatile main memory. Unlike all existing PM-oriented file systems, SPFS does not leverage DRAM to manage its metadata. SPFS outperforms traditional DRAM-based in-memory file systems ramfs and tmpfs in most cases. A concurrent and persistent data indexing tree called HART (Hash-assisted Adaptive Radix Tree) was developed. In most cases, HART significantly outperforms WOART (Write Optimal Adaptive Radix Tree) and FPTree (Fingerprinting Persistent Tree), two state-of-the-art persistent trees. Also, it scales well in concurrent scenarios. A persistent dynamic hashing scheme was also developed for persistent memory. It exhibits good performance, high scalability, and quick recovery.
Two reconfigurable NDP servers named RANS (Reconfigurable ARM-based NDP Server) and RFNS (Reconfigurable FPGA-based NDP Server) were developed. Several new findings were obtained, which shed light on how to apply NDP in data centers. For example, we found that while RANS can only benefit data-intensive applications, RFNS can offer benefits for both data-intensive and compute-intensive applications. Moreover, we found that for certain applications the reconfigurability of RANS/RFNS can deliver a noticeable improvement in energy efficiency without any performance degradation.
To demonstrate how to achieve NDP by using a hardware accelerator such as an FPGA, we implemented two kNN kernels on FPGA: MBFS-kNN (Memory-efficient Brute-Force Searching kNN) and MPCAF-kNN (Memory-efficient Principal Component Analysis based Filtering kNN). The two kernels are adaptive to all key parameters. We compared them with two cutting-edge kNN implementations on a high-end CPU server, an existing BFS-kNN kernel on FPGA, and an existing BFS-kNN kernel on GPU (Graphics Processing Unit). Our experimental results show that the two kernels substantially improve performance by greatly reducing external memory accesses. We also developed a file system middleware that optimizes the acquisition of bags, which are specially formatted files used to store timestamped ROS (robot operating system) messages.
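As a point of reference for the kNN kernels mentioned above, the basic idea of brute-force kNN search can be sketched in plain software. The sketch below is only an illustration of the general technique, not the project's MBFS-kNN or MPCAF-kNN FPGA kernels. The function name `brute_force_knn`, the Euclidean distance metric, and the bounded-heap bookkeeping are all assumptions made for this example.

```python
import heapq
import math


def brute_force_knn(points, query, k):
    """Return [(index, distance), ...] for the k points nearest to `query`.

    Hypothetical helper for illustration: scans every point exactly once and
    keeps at most k candidates in a heap, using Euclidean distance.
    """
    if k <= 0:
        return []
    # Entries are (-distance, index), so heap[0] is the farthest kept candidate.
    heap = []
    for index, point in enumerate(points):
        dist = math.dist(point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, index))
        elif dist < -heap[0][0]:
            # The new point is closer than the current worst candidate.
            heapq.heapreplace(heap, (-dist, index))
    # Report the kept candidates in ascending order of distance.
    nearest = sorted((-neg_dist, index) for neg_dist, index in heap)
    return [(index, dist) for dist, index in nearest]


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    print(brute_force_knn(data, (0.0, 0.0), 2))
```

Keeping only k candidates at a time bounds the working set regardless of the data set size, which is one simple way a brute-force scan can limit memory traffic. How the project's kernels actually reduce external memory accesses is not described in this report.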
Last Modified: 10/26/2023
Modified by: Tao Xie