Award Abstract # 1149051
CAREER: Automatic Learning of Adaptive Network-Centric Malware Detection Models

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF GEORGIA RESEARCH FOUNDATION, INC.
Initial Amendment Date: May 11, 2012
Latest Amendment Date: June 13, 2016
Award Number: 1149051
Award Instrument: Continuing Grant
Program Manager: Phillip Regalia
pregalia@nsf.gov
(703)292-2981
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2012
End Date: May 31, 2018 (Estimated)
Total Intended Award Amount: $402,601.00
Total Awarded Amount to Date: $402,601.00
Funds Obligated to Date: FY 2012 = $157,137.00
FY 2014 = $79,079.00
FY 2015 = $81,785.00
FY 2016 = $84,600.00
History of Investigator:
  • Roberto Perdisci (Principal Investigator)
    perdisci@cs.uga.edu
Recipient Sponsored Research Office: University of Georgia Research Foundation Inc
310 E CAMPUS RD RM 409
ATHENS
GA  US  30602-1589
(706)542-5939
Sponsor Congressional District: 10
Primary Place of Performance: University of Georgia
200 D.W. Brooks Drive
ATHENS
GA  US  30602-5016
Primary Place of Performance Congressional District: 10
Unique Entity Identifier (UEI): NMJHD63STRC5
Parent UEI:
NSF Program(s): Special Projects - CNS,
Secure & Trustworthy Cyberspace
Primary Program Source: 01001213DB NSF RESEARCH & RELATED ACTIVIT
01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7434
Program Element Code(s): 171400, 806000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Malicious software (a.k.a. malware) underlies most cyber-criminal operations, causing significant financial loss and posing serious risks to national security.

This research creates novel network-centric behavior-based malware detection systems that automatically learn how to identify malware-compromised machines within a network, and that can self-tune to achieve the best possible trade-off between malware detection rate and false alarms for a given network. This self-tuning property is achieved by combining models of malware-generated network traffic with models of legitimate user-generated network activities to build hybrid detection models that can adapt to a specific network environment and accurately detect malware-generated network traffic crossing the network perimeter.

This new approach to malware detection takes into account events that occur within an entire network, rather than focusing on events that occur at each single host, and focuses on adaptive detection of all types of malware, rather than being limited to a specific malware type (e.g., botnets). Therefore, the detection systems resulting from this research will provide new effective detection capabilities that can complement current anti-malware technologies and significantly contribute to a better defense-in-depth strategy against malware.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Babak Rahbarinia, Marco Balduzzi, Roberto Perdisci "Exploring the Long Tail of (Malicious) Software Downloads" IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2017
Babak Rahbarinia, Marco Balduzzi, Roberto Perdisci "Real-Time Detection of Malware Downloads via Large-Scale URL->File->Machine Graph Mining" ACM Symposium on InformAtion, Computer and Communications Security, 2016
Babak Rahbarinia, Roberto Perdisci, Andrea Lanzi, and Kang Li "PeerRush: Mining for unwanted P2P traffic" Journal of Information Security and Applications, v.19, 2014 http://dx.doi.org/10.1016/j.jisa.2014.03.002
Babak Rahbarinia, Roberto Perdisci, Manos Antonakakis "Efficient and Accurate Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks" ACM Transactions on Privacy and Security, v.19, 2016 http://dx.doi.org/10.1145/2960409
Bo Li, Phani Vadrevu, Kyu Hyung Lee, and Roberto Perdisci "JSgraph: Enabling Reconstruction of Web Attacks via Efficient Tracking of Live In-Browser JavaScript Executions" Network and Distributed System Security Symposium (NDSS), 2018
Phani Vadrevu and Roberto Perdisci "MAXS: Scaling Malware Execution with Sequential Multi-Hypothesis Testing" ACM Symposium on InformAtion, Computer and Communications Security, 2016
Phani Vadrevu, Jienan Liu, Bo Li, Babak Rahbarinia, Kyu Hyung Lee, Roberto Perdisci "Enabling Reconstruction of Attacks on Users via Efficient Browsing Snapshots" Network and Distributed System Security Symposium (NDSS), 2017
Roberto Perdisci, Davide Ariu, Giorgio Giacinto "Scalable Fine-Grained Behavioral Clustering of HTTP-Based Malware" Computer Networks, v.57, 2013, p.487–500 http://dx.doi.org/10.1016/j.comnet.2012.06.022
Terry Nelms, Roberto Perdisci, Manos Antonakakis, Mustaque Ahamad "Towards Measuring and Mitigating Social Engineering Malware Download Attacks" USENIX Security Symposium, 2016

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Malicious software (or malware) is the tool of choice used by cyber-criminals to perpetrate a large variety of computer and network attacks. For instance, attackers can infect thousands, or even millions, of machines with malware and remotely control them to launch devastating coordinated attacks against victim networks.

This CAREER project focuses on the network behavior of malware as a way of detecting malware-infected machines and blocking their malicious activities.

Detecting malware-infected machines

Because modern malware typically needs to communicate with a remote attacker to receive commands, report stolen information, launch further attacks, etc., a malware-infected machine will inevitably generate anomalous network traffic. This research therefore focuses on modeling the network behavior of malware using statistical machine learning approaches. This allows us to generate statistical fingerprints that can be matched against a machine’s network traffic to determine if the machine has been infected with malware.

One of the main challenges related to using statistical models to detect malware-generated network traffic is the possibility of false positives, namely benign activities that are misclassified as malicious. To mitigate false positives, this project focuses on building hybrid statistical models that capture the inherent features of both malware-generated network traffic and the “background” benign traffic generated by the networks where the malware detection models are to be deployed.
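
The trade-off described above can be illustrated with a minimal Python sketch. It is not the project's actual method: the function names (tune_threshold, rates), the idea of a single numeric "maliciousness" score per traffic sample, and all score values are assumptions made for illustration only. Given scores assigned to known malware traffic and to benign background traffic from the local network, the sketch picks the lowest alert threshold whose false positive rate on that network stays within a chosen budget.

```python
from typing import Sequence, Tuple


def tune_threshold(malware_scores: Sequence[float],
                   benign_scores: Sequence[float],
                   max_false_positive_rate: float) -> float:
    """Return the lowest threshold whose false positive rate on the
    local benign traffic does not exceed the given budget.

    Hypothetical helper: scores are assumed to be higher for traffic
    that looks more malicious, and an alert fires when score >= threshold.
    """
    if not malware_scores or not benign_scores:
        raise ValueError("both score lists must be non-empty")
    # Candidate thresholds: every observed score, lowest first.
    candidates = sorted(set(malware_scores) | set(benign_scores))
    for threshold in candidates:
        false_alarms = sum(1 for s in benign_scores if s >= threshold)
        if false_alarms / len(benign_scores) <= max_false_positive_rate:
            # Lowest acceptable threshold, hence highest detection rate.
            return threshold
    # No observed score satisfies the budget: alert on nothing.
    return float("inf")


def rates(malware_scores: Sequence[float],
          benign_scores: Sequence[float],
          threshold: float) -> Tuple[float, float]:
    """Return (detection rate, false positive rate) at a threshold."""
    detected = sum(1 for s in malware_scores if s >= threshold)
    false_alarms = sum(1 for s in benign_scores if s >= threshold)
    return detected / len(malware_scores), false_alarms / len(benign_scores)


# Made-up scores for illustration only.
malware = [0.9, 0.8, 0.6, 0.3]
benign = [0.1, 0.2, 0.4, 0.7, 0.05, 0.15, 0.25, 0.3, 0.1, 0.2]

threshold = tune_threshold(malware, benign, max_false_positive_rate=0.1)
detection_rate, false_positive_rate = rates(malware, benign, threshold)
```

Because the benign scores come from the deployment network itself, re-running the same procedure on a different network would generally yield a different threshold, which is the sense in which such a model adapts to its environment.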

Following the approach outlined above, we have developed several systems that aim to advance the state of the art in network-based malware detection. For instance, ExecScent is a system dedicated to detecting malware-infected machines in large enterprise networks by analyzing web traffic, which malware often uses to hide in plain sight among the huge amount of legitimate web traffic generated by the victim networks. We also developed PeerRush, a system for detecting malware that uses peer-to-peer communications as a form of command-and-control. In addition, we have leveraged malware-generated network traffic to improve the scalability of our machine learning-based systems, by inventing new algorithms to cluster malware samples into families and to discard duplicate malware samples that do not contribute to learning new malicious network behaviors.

Detecting web-based malware downloads

While the first part of this research focuses on detecting malware-infected machines by recognizing their malware-related network traffic, we also devote considerable effort to detecting malware downloads. Namely, we focus on detecting malware infections before they actually occur, while the victim machine is in the process of downloading (and before executing) a piece of malicious software.

Along this research direction, we developed several systems dedicated to reconstructing, studying, and detecting the network traffic generated by would-be victim machines. For instance, we developed Amico, an open-source system for malware download detection that has so far been deployed in three large university campus networks. We also developed a system for studying malware downloads triggered by social engineering attacks. Specifically, we have built a modified version of the Chrome browser that allows us to log detailed information about the user’s actions in the moments before a malicious software download occurred. Our system allows forensic analysts to travel back in time and reconstruct with high accuracy all browser-internal events involved in malware download attacks.

Educational outcomes

This CAREER award has supported the research work published by four PhD students in their respective theses. Furthermore, the PI has directed other minor research projects linked to this award at both the undergraduate and graduate level. Other educational outcomes include the establishment of a new graduate course in network security at the University of Georgia that covers topics related to network traffic analysis, modeling, and the use of machine learning for solving computer and network security problems in general.

Last Modified: 09/28/2018
Modified by: Roberto Perdisci

Please report errors in award information by writing to: awardsearch@nsf.gov.
