
NSF Org: | DGE Division Of Graduate Education |
Initial Amendment Date: | April 16, 2021 |
Latest Amendment Date: | May 2, 2024 |
Award Number: | 2114892 |
Award Instrument: | Standard Grant |
Program Manager: | Li Yang, liyang@nsf.gov, (703) 292-2677, DGE Division Of Graduate Education, EDU Directorate for STEM Education |
Start Date: | May 1, 2021 |
End Date: | April 30, 2025 (Estimated) |
Total Intended Award Amount: | $219,993.00 |
Total Awarded Amount to Date: | $219,993.00 |
Recipient Sponsored Research Office: | 1000 Hilltop Circle, Baltimore, MD, US 21250-0001, (410) 455-3140 |
Primary Place of Performance: | 1000 Hilltop Circle, Baltimore, MD, US 21250-0001 |
NSF Program(s): | Secure & Trustworthy Cyberspace |
Primary Program Source: | 04002122DB NSF Education & Human Resources |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.076 |
ABSTRACT
One of the most critical security challenges of the 21st century is protecting the cyber-physical systems that manage and control our infrastructure, vehicles, homes, and personal devices as well as the information that they store, use and exchange. Artificial intelligence (AI) and machine learning-based tools can help human analysts sort through large volumes of data to determine if an attack on these systems has happened. Yet, AI components are also vulnerable to attacks, and require development of techniques to make them more robust. This collaborative project between the University of Maryland Baltimore County (UMBC) and the University of Illinois addresses the research and educational aspects of combining AI and cybersecurity. Educational and training materials will be developed for use by college and university instructors and students and by cybersecurity and AI professionals. These materials will address how AI can improve security systems and how cybersecurity analytics can protect AI systems. In addition, the project will recruit students from groups that have been traditionally underrepresented in computing.
This project has three interrelated topics. The first focuses on education and extends the project team's existing cybersecurity concept inventory to include relevant AI-related concepts. Students' knowledge of cybersecurity and AI, and of how the two relate, will be assessed before and after taking AI or cybersecurity courses. Educational materials and projects will also be created to demonstrate how AI can be applied to cybersecurity problems and how cybersecurity tools can protect AI systems from attack. The second topic explores how the latest AI tools can support cybersecurity tasks. The creation and maintenance of semantic knowledge graphs of cyberthreat information will be studied and used to support reinforcement learning systems that are better at detecting the presence of malware in a host. The third topic focuses on finding new ways that cybersecurity tools can protect AI systems from becoming compromised by attacks such as data poisoning. Cyberthreat knowledge graphs and neural networks will be used to detect and eliminate likely disinformation from data used to train AI-based cybersecurity systems. This aspect of the project has applications beyond cybersecurity, such as countering disinformation.
This project is supported by the Secure and Trustworthy Cyberspace (SaTC) program, which funds proposals that address cybersecurity and privacy, and in this case specifically cybersecurity education. The SaTC program aligns with the Federal Cybersecurity Research and Development Strategic Plan and the National Privacy Research Strategy to protect and preserve the growing social and economic benefits of cyber systems while ensuring security and privacy.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This collaborative project between the University of Maryland, Baltimore County and the University of Illinois Urbana-Champaign addressed both the research and educational aspects of applying AI technology to cybersecurity. Our goals were to (1) carry out novel research on applying the latest AI techniques to cybersecurity problems and explore how attacks on AI systems can be mitigated, (2) extend our work on evaluating students' understanding of the underlying security concepts to include AI-related topics, and (3) create and evaluate materials for undergraduate, graduate, and professional courses on both cybersecurity and AI, covering concepts, examples, and tools that illustrate how the two fields can support one another. At UMBC the project provided partial support for three PhD students and multiple undergraduate students.
We worked with a group of undergraduate students who learned how to build language understanding pipelines using the spaCy NLP tools. They built a corpus of text about cybersecurity by scraping relevant text from web pages and documents, constructed a set of domain-relevant entity types for cybersecurity, configured and applied annotation framework tools, and used the Prodigy annotation system to create training and evaluation datasets. Additional modules were also created using regular expressions to recognize and extract cybersecurity-relevant entities from text, such as URLs, email addresses, IP addresses, hash values, and process identifiers. The results of this work were presented and published in several venues.
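A minimal sketch of this style of pipeline is shown below. It combines a spaCy entity ruler with regex-based token patterns to tag indicators such as IP addresses, hashes, emails, and URLs; the entity labels, patterns, and example text are illustrative assumptions, not the project's actual annotation scheme or corpus.

```python
# Illustrative sketch: tagging cybersecurity indicators with spaCy.
# Labels and regex patterns are assumptions, not the project's scheme.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Token-level patterns for a few common indicator-of-compromise types.
ruler.add_patterns([
    {"label": "IP_ADDRESS", "pattern": [{"TEXT": {"REGEX": r"^\d{1,3}(\.\d{1,3}){3}$"}}]},
    {"label": "SHA256", "pattern": [{"TEXT": {"REGEX": r"^[0-9a-fA-F]{64}$"}}]},
    {"label": "EMAIL", "pattern": [{"LIKE_EMAIL": True}]},
    {"label": "URL", "pattern": [{"LIKE_URL": True}]},
])

text = ("The dropper at http://malware.example/payload.exe beacons to "
        "203.0.113.7 and mails credentials to ops@attacker.example; sample "
        "hash e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.")

for ent in nlp(text).ents:
    print(ent.label_, ent.text)
```

In the actual work, ruler-style patterns like these would complement a statistical NER model trained on the Prodigy-annotated corpus rather than replace it.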
We worked with a second group of students, who explored how reinforcement learning (RL) can be used to build better tools for detecting malware infecting a computer. The group learned how to collect and use data from VirusTotal, using the tasks in the Machine Learning Security Evasion Competition (MLSEC) as a problem framework. The group experimented with RL techniques for the defender challenge, in which contestants develop malware detection models to be tested in the later attacker challenge.
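The report does not detail the RL formulation, but the flavor of the approach can be sketched as a simple bandit-style learner that tunes a detection threshold from reward feedback. Everything below (the suspicion scores, actions, and rewards) is invented for illustration and is not the team's method or the MLSEC setup.

```python
# Toy bandit-style RL sketch: learn a malware-detection threshold from
# reward feedback. All data and parameters here are illustrative.
import random

random.seed(0)
thresholds = [round(0.1 * i, 1) for i in range(1, 10)]  # candidate cutoffs
q = {t: 0.0 for t in thresholds}   # estimated value of each threshold
alpha, epsilon = 0.1, 0.2          # learning rate, exploration rate

def sample():
    """Fake labeled sample: (suspicion score, is_malware)."""
    if random.random() < 0.5:
        return random.gauss(0.7, 0.15), True    # malware scores high
    return random.gauss(0.3, 0.15), False       # benign scores low

for _ in range(5000):
    if random.random() < epsilon:
        t = random.choice(thresholds)           # explore
    else:
        t = max(q, key=q.get)                   # exploit best so far
    score, is_malware = sample()
    reward = 1.0 if (score >= t) == is_malware else -1.0
    q[t] += alpha * (reward - q[t])             # incremental value update

print("learned threshold:", max(q, key=q.get))
```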
We also studied the problem of privacy-preserving data generation, which involves creating new data that maintains privacy while preserving key characteristics and properties of the original data, so that it remains useful for building downstream models of attacks. We explored a technique we call Knowledge Infused Privacy Preserving Data Generation, which uses a generative adversarial network trained on system data to generate synthetic datasets that can replace the original data in downstream tasks while protecting sensitive information. We demonstrated this model by synthesizing network data captured with the Wireshark network capture tool and showed that the synthetic dataset satisfies the constraints of network-specific data and can replace the original dataset in downstream tasks. We also applied the technique to sharing agricultural data.
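For readers unfamiliar with the underlying mechanism, a generic GAN training loop for tabular, flow-like records looks roughly like the sketch below. This is a stand-in, not the Knowledge Infused model: the feature count, network sizes, and random "real" data are assumptions.

```python
# Generic GAN sketch for synthesizing tabular network-flow features.
# Stand-in only: feature count, sizes, and data are assumptions.
import torch
import torch.nn as nn

NOISE, FEATS = 16, 8  # latent size; e.g., duration, bytes, ports, flags...

G = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, FEATS))
D = nn.Sequential(nn.Linear(FEATS, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real_data = torch.randn(256, FEATS)  # stand-in for normalized flow records

for step in range(200):
    # Discriminator step: distinguish real from generated records.
    fake = G(torch.randn(256, NOISE)).detach()
    loss_d = (bce(D(real_data), torch.ones(256, 1)) +
              bce(D(fake), torch.zeros(256, 1)))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: produce records the discriminator accepts as real.
    fake = G(torch.randn(256, NOISE))
    loss_g = bce(D(fake), torch.ones(256, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

synthetic = G(torch.randn(1000, NOISE))  # shareable synthetic records
print(synthetic.shape)
```

The "knowledge infused" aspect of the actual technique adds domain constraints the plain loop above lacks, which is what lets the synthetic records satisfy network-specific constraints.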
We conducted research on building and using large language models (LLMs) for cybersecurity applications. This required creating a corpus of cybersecurity-relevant text for enhancing existing LLMs and exploring several use cases for the resulting model. To evaluate how well LLMs understand cybersecurity problems, we used several public LLM systems to answer questions developed for assessing students' understanding of cybersecurity concepts, and found that the systems did surprisingly well. We also studied using the emergent reasoning capabilities of LLMs to detect inconsistencies between extracted facts and their provenance, investigating how model architecture (encoder-decoder versus decoder-only), model size, and the entities involved affect the identification capabilities of LLMs.
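As a simple illustration of this kind of probe, one can pose a concept-inventory-style multiple-choice question to a public model and check its answer. The question, the choice of model, and the scoring below are invented for the sketch; they are not the project's concept inventory items or its evaluation setup.

```python
# Sketch of probing an LLM with a multiple-choice security question.
# Question, model, and scoring are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in model

question = (
    "A web form passes user input directly into a SQL query string.\n"
    "Which attack does this most directly enable?\n"
    "(A) Cross-site scripting (B) SQL injection "
    "(C) Buffer overflow (D) ARP spoofing\n"
    "Answer:"
)
out = generator(question, max_new_tokens=5, do_sample=False)
answer = out[0]["generated_text"][len(question):].strip()
print("model answered:", answer)
print("correct" if answer.upper().startswith("B") else "incorrect")
```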
Last Modified: 07/06/2025
Modified by: Timothy W Finin