Award Abstract # 1813662
III: Small: Mirador: Explainable Computational Models for Recognizing and Understanding Controversial Topics Encountered Online

NSF Org: IIS (Division of Information & Intelligent Systems)
Recipient: UNIVERSITY OF MASSACHUSETTS
Initial Amendment Date: August 13, 2018
Latest Amendment Date: August 13, 2018
Award Number: 1813662
Award Instrument: Standard Grant
Program Manager: Hector Munoz-Avila
IIS (Division of Information & Intelligent Systems)
CSE (Directorate for Computer and Information Science and Engineering)
Start Date: September 1, 2018
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $499,682.00
Total Awarded Amount to Date: $499,682.00
Funds Obligated to Date: FY 2018 = $499,682.00
History of Investigator:
  • James Allan (Principal Investigator)
    allan@cs.umass.edu
Recipient Sponsored Research Office: University of Massachusetts Amherst
101 COMMONWEALTH AVE
AMHERST
MA  US  01003-9252
(413)545-0698
Sponsor Congressional District: 02
Primary Place of Performance: University of Massachusetts Amherst
OGCA, 100 Venture Way, Ste. 201
Hadley
MA  US  01035-9450
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): VGJHK59NMPK9
Parent UEI: VGJHK59NMPK9
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7364, 7923
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project aims to develop algorithms and tools that allow a person to recognize that a web page or other document discusses one or more topics that are controversial -- that is, about which there is strong disagreement within some sizeable group of people. The project will develop algorithms and tools that explain the controversy surrounding the topic, identifying the populations that disagree, the stances that they take, and how those stances conflict with each other. The advances in these algorithms will broaden the research community's understanding of how discussions and disagreements on topics can be modeled computationally and how that resulting information can be conveyed to a general user. The project will assist people in critical evaluation of on-line material and help them understand why a page is educative or why it is not.

The aim of this project is to provide users with tools that illuminate the broader context of the topic or topics of a single page or document that someone finds. Previous work has shown that it is possible to recognize with reasonable accuracy that a document is part of a controversial topic, but that work is fragmented across different genres, demands more robust modeling and more thorough evaluation, and lacks explanatory power that can help a reader understand why and how a text is contentious. In this project, the researchers explore fundamental questions about how controversy can be modeled computationally so that it can be recognized "in the wild". The project also explores model variations that allow an algorithm to extract an explanation of the nature of the controversy. The project applies and extends text analysis and comparison techniques. It leverages powerful statistical language modeling methods as well as recent neural network (deep learning) approaches to represent text, its controversial nature, its stances, and their relationships, all extracted from Web pages and other documents. The modeling will initially be used offline to identify a collection of topics known to be controversial; that collection will then be adapted by monitoring slowly changing news sources and blog postings, as well as ephemeral microblog data, to capture rapid changes in controversy. The researchers will make the resulting techniques available by providing an open-source example server.
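As a rough illustration of the offline seed-topic step described above (this is not the project's actual system; the seed topics, threshold, and function name below are illustrative assumptions), a document can be scored by comparing it against a small collection of texts on topics already known to be controversial:

```python
# Illustrative sketch only: flag a document as potentially controversial by
# comparing it to a seed collection of known-controversial topics.
# Seed texts, names, and the threshold are assumptions, not project artifacts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SEED_TOPICS = {
    "gun control": "debate over firearm regulation, background checks, and gun sales",
    "vaccination": "dispute over vaccine mandates, safety claims, and exemptions",
}

def controversy_score(document: str) -> tuple[str, float]:
    """Return the closest known-controversial topic and its similarity."""
    topics = list(SEED_TOPICS)
    vectorizer = TfidfVectorizer(stop_words="english")
    # Vectorize the seed texts and the new document in one shared vocabulary.
    matrix = vectorizer.fit_transform([SEED_TOPICS[t] for t in topics] + [document])
    sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    best = int(sims.argmax())
    return topics[best], float(sims[best])

topic, score = controversy_score("New state law tightens background checks for gun sales.")
if score > 0.1:  # illustrative threshold; a real system would tune this on labeled data
    print(f"possibly controversial; closest known topic: {topic} (sim={score:.2f})")
```

A deployed recognizer would replace the handful of seed strings with a large, continually updated topic collection and a learned model, but the retrieve-and-compare structure is the same.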

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Chowdhury, Tanya and Rahimi, Razieh and Allan, James. "Equi-explanation Maps: Concise and Informative Global Summary Explanations." Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22), 2022. https://doi.org/10.1145/3531146.3533112
Huang, Zhiqi and Rahimi, Razieh and Yu, Puxuan and Shang, Jingbo and Allan, James. "AutoName: A Corpus-Based Set Naming Framework." Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21), 2021. https://doi.org/10.1145/3404835.3463100
Kim, Youngwoo and Allan, James. "Unsupervised Explainable Controversy Detection from Online News." Proceedings of the European Conference on Information Retrieval (ECIR 2019), 2019. https://doi.org/10.1007/978-3-030-15712-8_60
Kim, Youngwoo and Jang, Myungha and Allan, James. "Explaining Text Matching on Neural Natural Language Inference." ACM Transactions on Information Systems, v.38, 2020. https://doi.org/10.1145/3418052
Kim, Youngwoo and Rahimi, Razieh and Allan, James. "Alignment Rationale for Query-Document Relevance." Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22), 2022. https://doi.org/10.1145/3477495.3531883
Kim, Youngwoo and Rahimi, Razieh and Bonab, Hamed and Allan, James. "Query-driven Segment Selection for Ranking Long Documents." Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM '21), 2021. https://doi.org/10.1145/3459637.3482101
Ramezani, S. and Rahimi, R. and Allan, J. "Aspect Category Detection in Product Reviews using Contextual Representation." Proceedings of the ACM SIGIR Workshop on eCommerce (SIGIR eCom'20), 2020.
Sarwar, Sheikh Muhammad and Moraes, Felipe and Jiang, Jiepu and Allan, James. "Utility of Missing Concepts in Query-biased Summarization." Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21), 2021. https://doi.org/10.1145/3404835.3463121
Yu, Puxuan and Rahimi, Razieh and Huang, Zhiqi and Allan, James. "Learning to Rank Entities for Set Expansion from Unstructured Data." Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2020), 2020. https://doi.org/10.1145/3409256.3409811

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

A challenge faced by nearly every user of the web is knowing whether the text they are reading is accurate or is part of a large and controversial discussion. Attempts to address this problem focus on whether the author or source is authoritative, whether the language is inappropriate, or similar high-level clues.

As an addition to those approaches, we have focused on explanation approaches that highlight differences between claims in one document and those in other documents -- ideally more authoritative statements of evidence. We developed several types of models: some that captured the relationship between documents and the query that retrieved them, some that extracted the key topics and subtopics of documents to show the connections between documents with different perspectives, and some that explicitly captured the alignment between words in different spans of text to highlight the disagreement across sentences. We created training and test collections of data that have been freely shared with the research community. We used that data to measure the effectiveness of our various approaches, finding that they generally outperformed existing approaches adapted to these problems. This grant supported the training of five PhD students (two of whom are female). It resulted in nine publications, contributing to the theses of the students.
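As a toy illustration of the last idea (aligning words across spans to surface disagreement), the sketch below marks content words in a claim that have no counterpart in an evidence sentence, and vice versa. The project's actual models learned such alignments with neural methods; the stopword list, example sentences, and function names here are illustrative assumptions.

```python
# Illustrative sketch only: a bag-of-words stand-in for learned span alignment.
# Unmatched content words on either side are candidate loci of disagreement
# that an explanation interface could highlight for the reader.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "that", "with"}

def tokens(text: str) -> list[str]:
    """Lowercase content words, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def unaligned(claim: str, evidence: str) -> tuple[set[str], set[str]]:
    """Words in each span with no exact-match counterpart in the other."""
    c, e = set(tokens(claim)), set(tokens(evidence))
    return c - e, e - c

claim = "Vaccines cause serious long-term harm in most children."
evidence = "Large studies find vaccines safe, with serious effects extremely rare."
only_claim, only_evidence = unaligned(claim, evidence)
print("claim-only terms:", sorted(only_claim))
print("evidence-only terms:", sorted(only_evidence))
```

Exact string matching obviously misses paraphrase and negation, which is why the published models instead learned soft alignments over neural text representations.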


Last Modified: 01/09/2023
Modified by: James M Allan
