NSF Award Search: Award # 1907381

Award Abstract # 1907381

CNS Core: Small: Ultra-Efficient Neural Network and LSTM Architectures

NSF Org:	CNS Division Of Computer and Network Systems
Recipient:	THE TRUSTEES OF PRINCETON UNIVERSITY
Initial Amendment Date:	July 9, 2019
Latest Amendment Date:	July 9, 2019
Award Number:	1907381
Award Instrument:	Standard Grant
Program Manager:	Daniela Oliveira doliveir@nsf.gov (703)292-0000 CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering
Start Date:	October 1, 2019
End Date:	September 30, 2023 (Estimated)
Total Intended Award Amount:	$500,000.00
Total Awarded Amount to Date:	$500,000.00
Funds Obligated to Date:	FY 2019 = $500,000.00
History of Investigator:	Niraj Jha (Principal Investigator)
Recipient Sponsored Research Office:	Princeton University 1 NASSAU HALL PRINCETON NJ US 08544-2001 (609)258-3090
Sponsor Congressional District:	12
Primary Place of Performance:	Princeton University 87 Prospect Avenue, 2nd floor Princeton NJ US 08544-2020
Primary Place of Performance Congressional District:	12
Unique Entity Identifier (UEI):	NJ1YPQXQG7U5
Parent UEI:
NSF Program(s):	Special Projects - CNS, CSR-Computer Systems Research
Primary Program Source:	01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	075Z, 7923
Program Element Code(s):	171400, 735400
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

Neural networks (NNs) have begun to have widespread impact on various important applications, such as image recognition, speech recognition, and machine translation. The spurt of interest in machine learning and artificial intelligence in this decade can be traced back to the increase in accuracy that NNs have enabled. Yet, how to come up with the best NN architecture has remained an open problem. Hence, it is attracting a lot of attention from the academia and industry. This work will address this problem.

NN synthesis has largely been limited to big-data applications and the NN models are typically expected to run in the cloud. However, there is recent interest from the industry to have edge-level (e.g., in smartphone or smartwatch) NN models. The current edge-level NNs sacrifice accuracy (by 4-5%) for energy and latency efficiency. NNs are also often not competitive with other models for medium-data and small-data applications. Finally, sequence-to-sequence modeling (e.g., for language translation) also needs to be made much more accurate, fast, and compact enough for edge devices. All these problems will be tackled in this work through new NN synthesis techniques and tools.

This research has the potential to enable transformative advances in overcoming the deficiencies of current NN synthesis methodologies. Due to the explosion in machine learning applications, this research has the promise to provide a significant boost to U.S. companies and economy. Thus, it will involve significant industrial engagements. Several underrepresented (minority/female) will be involved in the research. The research outcomes will be included in two undergraduate courses on Machine Learning and Embedded Computing. Broad dissemination to the academic and industrial communities will be achieved through published papers, posters, and seminars. Additionally, various tools and models will be distributed online.

The list of publications/students and tools/data with appropriate documentation will be made available at https://www.princeton.edu/~jha/. Free use of data and artifacts will be permitted for research and educational purposes. The data will be available online for at least five years following the completion of the project.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Xia, Wenhan and Yin, Hongxu and Jha, Niraj K. "INVITED: Efficient Synthesis of Compact Deep Neural Networks" IEEE Design Automation Conference , 2020 https://doi.org/10.1109/DAC18072.2020.9218529 Citation Details

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Neural networks (NNs) have become the workhorse of the Artificial Intelligence (AI) industry. A decade ago, the trend was to design large NNs trained on big data. However, soon it was realized that edge-friendly NNs could have much broader applications. These NNs have to be highly energy-, memory- and latency-efficient since many edge devices are battery-operated. Such devices include smartphones and smartwatches. This project was focused on the design and applications of such NNs.

Many strong pillars are needed to support edge-friendly NNs. The main one being a tool that synthesizes energy/memory/latency-efficient NNs from given datasets. We have developed a tool, called SCANN, that results in NNs that are, on an average, two orders of magnitude smaller than traditional fully-connected NNs, yet at the same time boost the NN accuracy. SCANN introduced a grow-and-prune synthesis paradigm modeled after the human brain. Starting from a seed architecture, akin to a baby brain, it adds neurons and connections to the NN to significantly improve its accuracy, ending in an NN that is akin to a toddler brain. Finally, SCANN specializes this NN into an adult-brain-like architecture that performs very accurately on the given application, but is orders of magnitude smaller. We extended SCANN to a tool called CURIOUS to fully exploit the grow-and-prune synthesis paradigm.

Edge applications typically do not have large training datasets. However, NNs are very data-hungry. Thus, another important pillar is a synthetic data generator that can sample unlimited synthetic data instances from the same probability distribution as the real dataset. We have developed one such tool, called TUTOR. It provides a significant boost to accuracy of NNs. In medical applications, it also ensures privacy when real data cannot be shared. Instead, synthetic data can be shared without sacrificing on the accuracy of the models trained with such data.

A third pillar is finding out which data instances are incorrectly labeled. In supervised machine learning, it is extremely important that the labels be correct. Otherwise, prediction accuracy suffers. We developed a tool, called CTRL, that automatically determines which labels are wrong. Removing the data instances with wrong labels boosts the accuracy of NNs significantly. CTRL is applicable to both tabular and image-based datasets. Thus, it is widely applicable across diverse Internet-of-Things (IoT) applications. It is an important preprocessing step before NN training commences.

Real-world datasets often have missing values. Since the dataset sizes are often small, it is not simply possible to delete all data instances that have any missing values. Thus, imputing these values becomes important. We have developed a tool, called DINI, that imputes missing values and obtains state-of-the-art results in terms of NN accuracy.

Edge-friendly NNs make new smart healthcare applications possible. We have developed a framework, called DOCTOR, that produces a multi-headed NN. The body of the NN can be trained with wearable sensor data that can be obtained from sensors found on the smartwatch or smartphone and the different heads can detect different diseases. We have shown how DOCTOR can detect Covid-19, Type I/II diabetes, and mental health disorders (e.g., depression, bipolar disorder, and schizophrenia) through the multiple heads. This multi-headed NN is highly energy/memory/latency-efficient and can reside on a smartwatch.

Convolutional neural networks (CNNs) are often used in image-based applications. These are often very large models and thus their inference latency is often not compatible with latency-sensitive applications. We have developed a dynamic inference method that skips many layers of the CNN depending on the image it encounters at inference time, without giving up on prediction accuracy.

Four PhD students were trained in this exciting area of research. Their work resulted in several journal articles. The results were disseminated to the wider public through various invited talks. The students also did technology transfer through their summer internships at various companies. The frameworks were used in various course projects. Several tools/frameworks developed in this project are also being commercialized by a startup company.

Last Modified: 10/15/2023
Modified by: Niraj K Jha

Please report errors in award information by writing to: awardsearch@nsf.gov.

Success

Error