Award Abstract # 1816019
SaTC: CORE: Small: Collaborative: ForensicExaminer: Testbed for Benchmarking Digital Audio Forensic Algorithms

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: REGENTS OF THE UNIVERSITY OF MICHIGAN
Initial Amendment Date: September 4, 2018
Latest Amendment Date: July 12, 2022
Award Number: 1816019
Award Instrument: Standard Grant
Program Manager: Dan Cosley
dcosley@nsf.gov
(703)292-8832
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: September 15, 2018
End Date: January 31, 2023 (Estimated)
Total Intended Award Amount: $291,471.00
Total Awarded Amount to Date: $331,471.00
Funds Obligated to Date: FY 2018 = $291,471.00
FY 2019 = $8,000.00
FY 2020 = $16,000.00
FY 2021 = $8,000.00
FY 2022 = $8,000.00
History of Investigator:
  • Hafiz Malik (Principal Investigator)
    hafiz@umich.edu
Recipient Sponsored Research Office: Regents of the University of Michigan - Ann Arbor
1109 GEDDES AVE STE 3300
ANN ARBOR
MI  US  48109-1015
(734)763-6438
Sponsor Congressional District: 06
Primary Place of Performance: University of Michigan - Dearborn
4901 Evergreen Rd.
Dearborn
MI  US  48128-2406
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): GNJ7BBP73WE9
Parent UEI:
NSF Program(s): Special Projects - CNS, Secure & Trustworthy Cyberspace
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 025Z, 7434, 7923, 9178, 9251
Program Element Code(s): 171400, 806000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The proliferation of powerful smart-computing devices (e.g., smartphones, surveillance systems) capable of production, editing, analysis, and sharing of multimedia files and associated technological advances have affected almost every aspect of our lives. The use of digital multimedia (images, audio, and video) as evidence is rapidly growing in multiple applications, including legal proceedings and law enforcement. However, forensic audio examiners are facing a new challenge of analyzing evidence containing audio from social networking websites, because audio editing and manipulation tools are both sophisticated and easy to use, increasing the risk of audio forgery. The goal of this project is to develop a framework and methods to support audio forensics examiners in detecting and localizing tampering in audio files, including developing novel algorithms to associate files to specific recording devices; creating methods to detect and estimate the risks of attempts to evade existing forgery detectors; evaluating speaker recognition systems in the presence of audio replay attacks; and collecting a large and diverse dataset of recordings that can be used for benchmarking of existing and future audio forensic analysis tools and techniques. The project also has a significant educational component, consisting of a set of hands-on activities involving media generation, manipulation, and analysis aimed at outreach and broadening participation in science, technology, engineering and mathematics (STEM) disciplines including forensic science, digital signal processing, statistical data analysis, and digital forensics.

The project has four main research thrusts. The first will involve designing effective microphone fingerprint modeling and extraction algorithms tailored for audio forensic applications. The team will leverage microphone calibration methods, statistical signal processing techniques for blind microphone fingerprint estimation, and system identification methods for linking an audio recording to a specific recording device. The second thrust aims to investigate the impact of anti-forensic attacks on existing forgery detectors and replay attacks on speaker recognition systems. The research team will design attack models to perturb the underlying forgery detection feature space and analyze performance of existing and new algorithms under these anti-forensic attacks. The third research effort will be focused on designing new audio forensic analysis algorithms robust to these and other emerging anti-forensic attacks. The team will use manipulation methods for anti-forensic attacks and a game-theory-based framework for attack-aware tamper detection and design new forensic methods based on findings of these activities. The fourth research thrust will aim at developing a first-of-its-kind research commons for audio forensics consisting of benchmarking datasets, algorithms, and tools. The team will collect audio from both controlled settings and crowdsourcing in the wild, and use known audio manipulation, editing, and anti-forensic techniques to generate tampered datasets. The team will design and deploy the benchmarking testbed, ForensicExaminer, using a microservices architecture.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Aljasem, Muteb and Irtaza, Aun and Malik, Hafiz and Saba, Noushin and Javed, Ali and Malik, Khalid Mahmood and Meharmohammadi, Mohammad "Secure Automatic Speaker Verification (SASV) System Through sm-ALTP Features and Asymmetric Bagging" IEEE Transactions on Information Forensics and Security, v.16, 2021. https://doi.org/10.1109/TIFS.2021.3082303
Baumann, Roland and Malik, Khalid Mahmood and Javed, Ali and Ball, Andersen and Kujawa, Brandon and Malik, Hafiz "Voice spoofing detection corpus for single and multi-order audio replays" Computer Speech & Language, v.65, 2021. https://doi.org/10.1016/j.csl.2020.101132
Cheek, Eric and Khuttan, Dhimant and Changalvala, Raghu and Malik, Hafiz "Physical Fingerprinting of Ultrasonic Sensors and Applications to Sensor Security" IEEE International Conference on Dependability in Sensor, Cloud, and Big Data Systems and Applications 2020 (IEEE DependSys20), 2020. https://doi.org/10.1109/DependSys51298.2020.00018
Hafeez, Azeem and Khalid, Malik and Hafiz, Malik "Exploiting Frequency Response for the Identification of Microphone using Artificial Neural Networks" 2019 AES International Conference on Audio Forensics (June 2019), 2019.
Hassani, Ali and Diedrich, Jon and Malik, Hafiz "Monocular Facial Presentation-Attack-Detection: Classifying Near-Infrared Reflectance Patterns" Applied Sciences, v.13, 2023. https://doi.org/10.3390/app13031987
Hassani, Ali and Malik, Hafiz and Diedrich, Jon "Efficiently Mitigating Face-Swap-Attacks: Compressed-PRNU Verification with Sub-Zones" Technologies, v.10, 2022. https://doi.org/10.3390/technologies10020046
Javed, Ali and Malik, Khalid Mahmood and Irtaza, Aun and Malik, Hafiz "Towards protecting cyber-physical and IoT systems from single- and multi-order voice spoofing attacks" Applied Acoustics, v.183, 2021. https://doi.org/10.1016/j.apacoust.2021.108283
Malik, Hafiz "Fighting AI with AI: Fake Speech Detection using Deep Learning" 2019 AES International Conference on Audio Forensics (June 2019), 2019.
Malik, Hafiz "Securing Voice-Driven Interfaces Against Fake (Cloned) Audio Attacks" 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 2019. https://doi.org/10.1109/MIPR.2019.00104
Malik, Khalid Mahmood and Javed, Ali and Malik, Hafiz and Irtaza, Aun "A Light-Weight Replay Detection Framework For Voice Controlled IoT Devices" IEEE Journal of Selected Topics in Signal Processing, v.14, 2020. https://doi.org/10.1109/JSTSP.2020.2999828
Malik, Khalid Mahmood and Malik, Hafiz and Baumann, Roland "Towards Vulnerability Analysis of Voice-Driven Interfaces and Countermeasures for Replay Attacks" IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 2019. https://doi.org/10.1109/MIPR.2019.00106

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Widespread adoption of high-powered smart-computing devices, like smartphones, has profoundly impacted various aspects of our lives, including law enforcement and legal proceedings. These devices enable us to create, edit, analyze, and share multimedia files, driving significant technological advancements. Nowadays, digital multimedia, such as audio, video, and images, has become the standard form of evidence in litigation and criminal justice cases. However, ensuring the authenticity and integrity of this media poses significant challenges. To be admissible as evidence, digital media must be proven authentic, meaning it provides a complete record of events since its creation. Additionally, its integrity must be verified, involving the detection of any disruptions or intentional manipulations in the recording, such as insertions or compressions. This verification process becomes even more difficult when there is a lack of supporting data, like digital watermarks, and when the media has been altered with anti-forensic intent. Furthermore, the availability of powerful, user-friendly digital editing and manipulation tools, together with open-source generative AI algorithms, has further complicated the authentication and integrity verification of digital audio files.

The goal of this project is to investigate microphone and acoustic environment signatures in the evidentiary recording, to investigate the impact of anti-forensic attacks on existing methods, to design robust attack-aware algorithms against anti-forensic attacks, and to develop a common platform for benchmarking audio forensics algorithms and tools. These objectives have been achieved through the creation and implementation of the following range of methodologies, tools, and collaborative research platforms:

  1. In this project, we have developed several novel methods to improve the capture of microphone signatures from voice replay, voice cloning, and the algorithms used for their synthesis. We have also focused on understanding the dynamic speech variations in genuine signals and distinguishing them from spoofed samples. 
  2. Additionally, our research has shed light on the vulnerabilities of automated speaker verification systems to AI-cloned voices and multi-order replay spoofing. Furthermore, we have explored the impact of deepfake and recorded audio injected through LASER into MEMS microphones, and we have devised countermeasures to detect such attacks.
  3. Moreover, we have developed liveness detection techniques tailored to speaker verification.
  4. Unlike other single attack-specific countermeasures, we have developed two novel unified anti-spoofing methods that can handle various attacks on automated speaker verification using a single descriptor.
  5. The existing countermeasures against AI-generated voices are unable to reliably detect unseen samples or audio generated by new AI algorithms. Therefore, we have developed a novel spoofing transformer network called SpotNet. 
  6. Furthermore, we have developed a secure and lightweight automated speaker verification system that not only reliably authenticates speakers/users but also demonstrates resistance against multiple audio spoofing attacks. Moreover, this system possesses the capability to capture the signature of voice cloning algorithms. 
  7. In addition, we have developed deep learning-based methods for detecting both audio and video deepfakes by capturing temporal, spatial, and inter-domain inconsistencies. Furthermore, we have created the first-ever neuro-symbolic deepfake detection framework that leverages the observation that deepfakes often exhibit inter- or intra-modality inconsistencies in the emotional expressions of the manipulated individuals. 
  8. We have also developed a framework for securing voice-controlled devices and services (e.g., Amazon Alexa, Google Home, Apple Siri) against emerging threat vectors, including LASER injection attacks.
  9. We have created a publicly available voice spoofing detection corpus (VSDC) that includes bonafide samples as well as first-order and second-order replay samples.
  10. We have developed a research commons platform that facilitates apples-to-apples comparative analysis of state-of-the-art voice spoofing countermeasures.
  11. We have also investigated the impact of anti-forensic attacks on automatic speaker verification (ASV) and voice countermeasures. This investigation involved using voice samples obtained through facemasks and compressed audio samples, among other techniques.
  12. Additionally, we investigated the impact of adversarial attacks, including mole or other perturbation-based attacks, on audio-visual deepfake detection systems.
  13. We have developed an approach based on game theory to enhance the security of audio-visual deepfake detection systems.

The intellectual merit of this project is its interdisciplinary approach that spans multiple domains, including cybersecurity, audio/speech processing, digital forensics, applied probability theory, and large-scale experimentation with audio data. At its core, this project involves the development of novel algorithms capable of accurately modeling and extracting the unique characteristics of acquisition devices, effectively addressing anti-forensic attacks, assessing system performance in the face of such attacks, designing forensic detectors that are resilient to these attacks, and curating comprehensive datasets for benchmarking and platform development purposes.

This project has made substantial contributions with broader impacts spanning multiple domains including multimedia forensics, content integrity verification, civil and criminal proceedings, national security, the fintech industry, law enforcement, cyberspace, voice-activated services, and the entertainment industry.


Last Modified: 06/21/2023
Modified by: Hafiz Malik

Please report errors in award information by writing to: awardsearch@nsf.gov.
