
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | August 15, 2018 |
Latest Amendment Date: | August 15, 2018 |
Award Number: | 1814105 |
Award Instrument: | Standard Grant |
Program Manager: | Ann Von Lehmen, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2018 |
End Date: | September 30, 2021 (Estimated) |
Total Intended Award Amount: | $250,000.00 |
Total Awarded Amount to Date: | $250,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 2550 NORTHWESTERN AVE # 1100 WEST LAFAYETTE IN US 47906-1332 (765)494-1055 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
305 University 479072114 IN US 47907-2114 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Special Projects - CNS |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The networks that comprise the Internet are fundamental to our society, facilitating access to medical and financial services, supporting critical infrastructure such as the power grid, and enabling emergent services such as those provided by autonomous cars and IoT (Internet of Things) devices. Network behavior is dictated by a set of instructions, or protocols, developed and tested over time. Such protocols must operate correctly and comply with requirements that are usually described in one or more documents, i.e., in a textual representation. If they do not operate properly, the performance and security of a network could be compromised. The goal of this project is to increase assurance in network protocols, specifically in their compliance with specified rules, in their interoperability, and in their functionality. This project will accomplish this via a novel scheme to perform protocol testing through automated extraction of protocol requirements from their textual specifications. This would mark a significant advance in the field, toward automated mechanisms that assure that network protocols behave as expected, making networks more reliable and secure.
This multidisciplinary project combines expertise from natural language processing and computer networks to create methodologies, frameworks, a knowledge base, and tools for protocol validation for (1) compliance checking, (2) bug finding, and (3) interoperability testing. The general approach is to apply machine learning, semantic parsing, and information extraction techniques to structured text (RFCs, internet-drafts) and unstructured text (blogs, forums, and bug reports), and to create a knowledge base about the protocols. The knowledge base contains formal information, such as message formats, protocol state machines, and constraints, and semi-formal information, such as temporal properties, tuning conditions and parameters, changes from one version to another, and known bugs. It is then used to validate protocol implementations through protocol fuzzing, program analysis, software model checking, and measurement methods: to check whether protocols are compliant with their specifications, to detect semantic bugs dependent on intrinsic protocol properties, and to check for interoperability issues between different versions or protocol stacks. This work is guided by protocols from three representative domains: TCP variants, the SDN ecosystem, and the IoT smart home environment.
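As a rough illustration of the compliance-checking idea, the sketch below stores a protocol state machine as a transition table and replays an observed event trace against it. It is a minimal sketch, not the project's tooling: the `ProtocolFSM` class, the `check_trace` helper, and the state and event names are all hypothetical and are not taken from any RFC.

```python
from dataclasses import dataclass, field


@dataclass
class ProtocolFSM:
    """Hypothetical knowledge-base entry: one protocol's state machine."""
    initial: str
    # Maps (current state, event) to the next state.
    transitions: dict = field(default_factory=dict)

    def add(self, state, event, next_state):
        self.transitions[(state, event)] = next_state


def check_trace(fsm, events):
    """Replay events from the initial state.

    Returns (True, final_state) if every event is allowed by the FSM,
    otherwise (False, index_of_first_disallowed_event).
    """
    state = fsm.initial
    for index, event in enumerate(events):
        key = (state, event)
        if key not in fsm.transitions:
            return False, index
        state = fsm.transitions[key]
    return True, state


# Toy handshake with made-up state and event names (illustration only).
toy = ProtocolFSM(initial="CLOSED")
toy.add("CLOSED", "send_syn", "SYN_SENT")
toy.add("SYN_SENT", "recv_syn_ack", "ESTABLISHED")
toy.add("ESTABLISHED", "send_fin", "FIN_WAIT")

ok_result = check_trace(toy, ["send_syn", "recv_syn_ack"])
bad_result = check_trace(toy, ["send_syn", "send_fin"])
print(ok_result)
print(bad_result)
```

The second trace is rejected at its second event because the toy table has no `send_fin` transition out of `SYN_SENT`. A real checker would also need to handle message fields, timers, and nondeterminism, which this sketch ignores.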
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The major goal of this project is to increase assurance in network protocols, specifically in their compliance with specified rules, in their interoperability, and in their functionality. This multidisciplinary project combines expertise from natural language processing and computer networks to create methodologies, frameworks, a knowledge base, and tools for protocol validation.
The major outcome of this work is an approach that allows for automated extraction of protocol finite state machines (FSMs) from RFC specifications. RFCs are a common way of specifying Internet protocols. Our hybrid approach consists of three key steps: (1) large-scale word-representation learning for technical language, (2) focused zero-shot learning for mapping protocol text to a protocol-independent information language, and (3) rule-based mapping from protocol-independent information to a specific protocol FSM. The first step does not require direct annotation and does not add to the human effort involved in building the model. Our zero-shot information extraction approach builds on that representation. Since each protocol has its own set of predicates and variables, we use a zero-shot setup in which the protocols observed during training are kept separate from those used for testing. The model learns to identify and connect concepts relevant to the training protocols, and at test time it is evaluated on extracting a set of symbols that were not observed during training. We show the generalizability of our FSM extraction by using the RFCs for six different protocols: BGPv4, DCCP, LTP, PPTP, SCTP, and TCP. The extracted FSM can further be used for protocol validation. We demonstrated how automated extraction of an FSM from an RFC can be applied to the synthesis of attacks, with TCP and DCCP as case studies.
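Two pieces of this pipeline can be sketched in a few lines: keeping training and test protocols separate, and the rule-based step that turns protocol-independent records into an FSM. This is a minimal sketch under stated assumptions. The leave-one-protocol-out scheme is one possible way to realize the separation and is not confirmed by the report, and the record format (`kind`, `source`, `event`, `target`) is a hypothetical stand-in for the project's protocol-independent information language.

```python
# Protocol names are the six listed in the report.
PROTOCOLS = ["BGPv4", "DCCP", "LTP", "PPTP", "SCTP", "TCP"]


def leave_one_out_splits(protocols):
    """Yield (train_protocols, held_out) pairs.

    Assumption: one protocol is held out per split, so its symbols are
    never seen during training.
    """
    for held_out in protocols:
        train = [p for p in protocols if p != held_out]
        yield train, held_out


def build_fsm(records):
    """Rule-based mapping from extracted records to a nested transition table.

    Only complete records of kind 'transition' are used; incomplete
    transition records are collected for manual review instead of guessed.
    """
    fsm = {}
    skipped = []
    for rec in records:
        if rec.get("kind") != "transition":
            continue  # e.g. variable or timer mentions
        src, event, dst = rec.get("source"), rec.get("event"), rec.get("target")
        if not (src and event and dst):
            skipped.append(rec)
            continue
        fsm.setdefault(src, {})[event] = dst
    return fsm, skipped


# Hypothetical extractor output (illustration only).
records = [
    {"kind": "transition", "source": "CLOSED", "event": "active_open", "target": "SYN_SENT"},
    {"kind": "transition", "source": "SYN_SENT", "event": "recv_syn_ack", "target": "ESTABLISHED"},
    {"kind": "variable", "name": "timeout"},
    {"kind": "transition", "source": "ESTABLISHED", "event": "close"},  # no target
]

fsm, skipped = build_fsm(records)
splits = list(leave_one_out_splits(PROTOCOLS))
print(fsm)
print(len(skipped), len(splits))
```

Collecting incomplete records rather than filling them in reflects the fact that text extraction can be partial; how the project actually handles such cases is not described in the report.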
Work developed under this grant will contribute to increased assurance in protocol design and implementation. As the Internet consists of a myriad of protocols, this grant contributes to making the Internet infrastructure more resilient to failures and attacks.
This work contributed to the education and training of several PhD students and undergraduate students through the research they conducted with support from this grant. We disseminated our results in top venues in NLP and network security journals and conferences.
Last Modified: 01/03/2022
Modified by: Dan Goldwasser