
NSF Org: CNS Division of Computer and Network Systems
Recipient: The Pennsylvania State University
Initial Amendment Date: August 14, 2017
Latest Amendment Date: June 27, 2018
Award Number: 1742702
Award Instrument: Standard Grant
Program Manager: Sara Kiesler, skiesler@nsf.gov, (703) 292-8643, CNS Division of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2017
End Date: August 31, 2020 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $316,000.00
Funds Obligated to Date: FY 2017 = $300,000.00; FY 2018 = $16,000.00
History of Investigator: Dongwon Lee (Principal Investigator)
Recipient Sponsored Research Office: 201 Old Main, University Park, PA, US 16802-1503, (814) 865-1372
Sponsor Congressional District:
Primary Place of Performance: University Park, PA, US 16802-7000
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Secure & Trustworthy Cyberspace
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Awareness of misinformation online is becoming an increasingly important issue, especially when information is presented in the format of a news story, because (a) people may over-trust content that looks like news and fail to critically evaluate it, and (b) such stories can be easily spread, amplifying the effect of misinformation. Using machine learning methods to analyze a large database of articles labeled as more or less likely to contain misinformation, along with theoretical analyses from the fields of communication, psychology, and information science, the project team will first characterize what distinguishes stories that are likely to contain misinformation from others. These characteristics will be used to build a tool that calls out characteristics of a given article that are known to correlate with misinformation; they will also be used to develop training materials to help people make these judgments. The tool and training materials will be tested through a series of experiments in which articles are evaluated by the tool and by people both before and after undergoing training. The goal is to have a positive impact on online discourse by improving both readers' and moderators' ability to reduce the impact of misinformation campaigns. The team will make the models, tools, and training materials publicly available for others to use in research, in classes, and online.
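To make the idea of a cue-flagging tool concrete, here is a minimal illustrative sketch in Python. The surface cues below are hypothetical stand-ins invented for this example; the project's actual characteristics are derived from concept explication and machine learning, not from a hand-written list.

    import re

    # Hypothetical surface cues sometimes associated with low-credibility
    # stories; the project's learned characteristics are far richer.
    CUES = {
        "all_caps_words": re.compile(r"\b[A-Z]{4,}\b"),
        "excess_punctuation": re.compile(r"[!?]{2,}"),
        "clickbait_phrases": re.compile(r"you won't believe|shocking", re.IGNORECASE),
    }

    def flag_article(text):
        """Return, for each cue, how many times it occurs in the article text."""
        return {name: len(pattern.findall(text)) for name, pattern in CUES.items()}

    print(flag_article("SHOCKING!! You won't believe what scientists found..."))
    # -> {'all_caps_words': 1, 'excess_punctuation': 1, 'clickbait_phrases': 2}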
The team will use two main approaches to characterize articles that are more likely to contain misinformation. The first is a concept explication approach from the social sciences, based on a deep analysis of research writing around information dissemination and evaluation. The second is a supervised machine learning approach trained on large datasets of labeled articles, including verified examples of misinformation. Both approaches will consider characteristics of the content; of its visual presentation; of the people who create, consume, and share it; and of the networks it moves through. These models will be translated into a set of weighted rules that combine the insights from the two approaches, then instantiated in Markov Logic Networks. Such networks leverage the strengths of both first-order logic and probabilistic graphical models, allow for a variety of efficient inference methods, and have been applied to a number of related problems; the models will be evaluated offline against test data using standard machine learning techniques. Finally, the team will develop training materials based on existing work from the International Federation of Library Associations and Institutions and on heuristic guidelines derived from the modeling work in the first two tasks, evaluate them through the experiments described earlier, and disseminate them online along with the developed models.
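For reference, a Markov Logic Network attaches a weight w_i to each rule and defines the probability of a world x as P(x) = (1/Z) exp(sum_i w_i * n_i(x)), where n_i(x) counts the satisfied groundings of rule i and Z normalizes over all possible worlds. The toy Python sketch below enumerates the worlds over three propositional variables; the rules and weights are invented for illustration and are not the project's learned model.

    import itertools, math

    # Toy Markov Logic Network over three propositional variables. The
    # rules and their weights below are invented for illustration only.
    VARS = ("clickbaity_title", "unverified_source", "is_misinformation")

    def weighted_score(world):
        """Sum the weights of the rules this world satisfies."""
        w = dict(zip(VARS, world))
        score = 0.0
        if (not w["clickbaity_title"]) or w["is_misinformation"]:
            score += 1.5   # rule 1: clickbaity_title => is_misinformation
        if (not w["unverified_source"]) or w["is_misinformation"]:
            score += 0.8   # rule 2: unverified_source => is_misinformation
        return score

    worlds = list(itertools.product((False, True), repeat=len(VARS)))
    Z = sum(math.exp(weighted_score(w)) for w in worlds)   # partition function
    for world in worlds:
        p = math.exp(weighted_score(world)) / Z
        print(dict(zip(VARS, world)), round(p, 3))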
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The outcomes of this EAGER project include:
(1) improved understanding of the wide spectrum of misinformation and its multiple sub-types, obtained via concept explication from a social science perspective;
(2) identification and validation of computational and operational features of the identified sub-types of misinformation (e.g., clickbait, propaganda, native advertisement, fake news) across genres (e.g., politics, health);
(3) design and development of machine learning-based solutions to detect sub-types of misinformation accurately;
(4) development of foundational techniques to explain why a piece of information is true or fake, using user comments or counter-examples;
(5) improved understanding of people's susceptibility to misinformation and of their ability to distinguish misinformation from true news;
(6) a demonstration that machine learning-based detection models can be attacked by adversaries and forced to make wrong predictions about the veracity of news (see the sketch after this list); and
(7) suggestions for potential defenses against such attacks for machine learning-based detection models.
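As a toy illustration of outcome (6), and not the project's actual attack method, the Python sketch below trains a bag-of-words classifier on a few invented headlines and shows how substituting loaded words with out-of-vocabulary paraphrases can change its prediction.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented toy headlines; 1 = fake, 0 = real.
    texts = [
        "miracle cure doctors hate",
        "shocking secret the government hides",
        "senate passes annual budget bill",
        "university study published in journal",
    ]
    labels = [1, 1, 0, 0]

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)

    original = "shocking miracle cure revealed"
    print(model.predict([original]))    # likely classified as fake (1)

    # Word-substitution "attack": paraphrase the loaded terms so the
    # classifier no longer recognizes its fake-news vocabulary.
    attacked = "surprising unexpected remedy reported"
    print(model.predict([attacked]))    # prediction may flip to real (0)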
The project supported and trained three Ph.D. students (one of whom graduated and joined Michigan State University as a faculty member), three REU students (two of whom graduated and joined graduate schools at CMU and UT Austin, respectively), and three undergraduate students (all of whom graduated and joined industry). The project also contributed to developing the public benchmark dataset FakeNewsNet, used to evaluate and compare the performance of machine learning-based misinformation detection methods, which has become the most widely used dataset in the research community.
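For anyone wanting to run comparisons on FakeNewsNet, a minimal loading sketch follows. It assumes the CSV layout published in the dataset's public GitHub repository (github.com/KaiDMML/FakeNewsNet), where each row carries an article id, news_url, title, and associated tweet_ids; check the repository for the current layout.

    import pandas as pd

    # Assumes the CSV layout in the public FakeNewsNet repository
    # (github.com/KaiDMML/FakeNewsNet): one row per article with
    # columns id, news_url, title, tweet_ids.
    fake = pd.read_csv("dataset/politifact_fake.csv")
    real = pd.read_csv("dataset/politifact_real.csv")

    fake["label"] = 1   # labeled fake by the fact-checking source
    real["label"] = 0   # labeled real
    df = pd.concat([fake, real], ignore_index=True)

    print(df["label"].value_counts())
    print(df[["title", "label"]].head())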
Last Modified: 02/03/2021
Modified by: Dongwon Lee