Award Abstract # 2103563
CDS&E: Fast Search of Growing High-Dimensional Big Data to Enable Accurate Semiclassical Molecular Dynamics Studies of Large Molecular Systems

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: TEXAS TECH UNIVERSITY SYSTEM
Initial Amendment Date: March 22, 2021
Latest Amendment Date: May 21, 2021
Award Number: 2103563
Award Instrument: Standard Grant
Program Manager: Sheikh Ghafoor
sghafoor@nsf.gov
 (703)292-7116
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2021
End Date: May 31, 2026 (Estimated)
Total Intended Award Amount: $278,348.00
Total Awarded Amount to Date: $278,348.00
Funds Obligated to Date: FY 2021 = $278,348.00
History of Investigator:
  • Yu Zhuang (Principal Investigator)
    yu.zhuang@ttu.edu
Recipient Sponsored Research Office: Texas Tech University
2500 BROADWAY
LUBBOCK
TX  US  79409
(806)742-3884
Sponsor Congressional District: 19
Primary Place of Performance: Texas Tech University
Lubbock
TX  US  79409-3104
Primary Place of Performance Congressional District: 19
Unique Entity Identifier (UEI): EGLKRQ5JBCZ7
Parent UEI:
NSF Program(s): CDS&E
Primary Program Source: 01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 026Z, 8084
Program Element Code(s): 808400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Quantum effects are inherent factors in material properties and chemical processes. Because it captures quantum effects with good quantitative accuracy, ab initio semiclassical molecular dynamics simulation is a generally applicable tool for a broad range of chemical and materials science studies, including studies of pollutant effects on lung health, enzyme catalysis, ozone depletion, spacecraft surface coatings, and solar cells, as well as many other studies that promise to advance national health and pharmaceutical sciences, materials design for national defense, and energy and environmental protection research. However, the computational cost of semiclassical dynamics simulations is enormously high, which makes them highly challenging, and in many cases infeasible, for large molecular systems. This project proposes methods for reducing computational cost while maintaining simulation accuracy, which will extend semiclassical dynamics to a broader range of studies of national and scientific importance.

Ab initio semiclassical molecular dynamics simulation incurs enormous computational cost in calculating ab initio Hessians from quantum mechanical electronic structure theories. Hessian modeling that uses training data closest in time, drawn from a set of saved ab initio data, has been successful in reducing the cost of Hessian calculations while maintaining simulation accuracy. It was observed that the cost can be reduced further by using training data closest in spatial distance, which offers more chances for Hessian modeling to replace ab initio Hessian calculations. Because new ab initio data arrive frequently, the ab initio dataset is constantly growing. Searching a frequently updated, growing dataset is challenging because the algorithms must not only achieve high search efficiency but also reorganize the dataset efficiently under frequent insertions of new data. Existing search algorithms offer good search efficiency but poorer data-organizing efficiency, since they were designed for static or infrequently updated datasets. This project develops search algorithms that will be the first to leverage the growth process of datasets to deliver high efficiency in both searching and data organizing. Hessian modeling using the spatially closest training data returned by the new search algorithms has the potential to further reduce computational cost, promising to speed up dynamics simulations and to enable simulations of larger molecular systems and/or the use of higher-accuracy electronic structure theories that capture finer details of the molecular systems.
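
The search problem described above can be illustrated with a minimal Python sketch. It is not the project's algorithm: it is a brute-force baseline, included only to show the trade-off the abstract refers to. Insertion into the growing set is trivially cheap, while every nearest-neighbor query scans all saved points. The names `GrowingPointSet` and `can_model`, and the `radius` acceptance rule, are hypothetical and chosen for illustration only.

```python
import heapq
import math


class GrowingPointSet:
    """Brute-force baseline: O(1) insertion, O(n) work per search."""

    def __init__(self):
        # Saved molecular geometries, each stored as a flat coordinate tuple.
        self.points = []

    def insert(self, point):
        # Appending needs no re-organization of the existing data.
        self.points.append(tuple(point))

    def k_nearest(self, query, k):
        """Return up to k (distance, index) pairs, closest first."""
        candidates = (
            (math.dist(query, p), i) for i, p in enumerate(self.points)
        )
        return heapq.nsmallest(k, candidates)


def can_model(store, query, k, radius):
    """Hypothetical rule: model the Hessian only if k saved points
    lie within `radius` of the query geometry."""
    neighbours = store.k_nearest(query, k)
    return len(neighbours) == k and neighbours[-1][0] <= radius
```

In a simulation loop, a geometry for which `can_model` returns False would require an ab initio Hessian calculation, and that new data point would then be inserted into the set. As the set grows, the linear scan in `k_nearest` becomes the bottleneck, which is the situation that search structures designed for growing datasets aim to avoid.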

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Aftan, Sulaiman and Zhuang, Yu and Aseeri, Ahmad O and Shah, Habib "Steering a Standard Arab Language Processing Model Towards Accurate Saudi Dialect Sentiment Analysis Using Generative AI", 2024 https://doi.org/10.1109/BigData62323.2024.10825944
Docaj, Andris and Zhuang, Yu "Speeding Up Unsupervised Learning Through Early Stopping", 2023 https://doi.org/10.1109/BigData59044.2023.10386405
Li, Gaoxiang and Zhuang, Yu "Rethinking PUF Design for Scalable Edge AI: A Position on Balancing ML-Attack Resistance and Real-World Deployment", 2025
Thapaliya, Bipana and Mursi, Khalid T. and Zhuang, Yu "Machine Learning-based Vulnerability Study of Interpose PUFs as Security Primitives for IoT Networks" 2021 IEEE International Conference on Networking, Architecture and Storage (NAS), 2021 https://doi.org/10.1109/NAS51552.2021.9605405
Zhuang, Yu and Li, Gaoxiang and Mursi, Khalid T. "A Permutation Challenge Input Interface for Arbiter PUF Variants Against Machine Learning Attacks" 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2022 https://doi.org/10.1109/ISVLSI54635.2022.00094
