Award Abstract # 1459300
CRII: RI: Alignment in Web-Forum Discourse: Computational Models of Adaptation and Language Change

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: THE PENNSYLVANIA STATE UNIVERSITY
Initial Amendment Date: May 6, 2015
Latest Amendment Date: May 6, 2015
Award Number: 1459300
Award Instrument: Standard Grant
Program Manager: D. Langendoen
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: May 1, 2015
End Date: April 30, 2019 (Estimated)
Total Intended Award Amount: $174,485.00
Total Awarded Amount to Date: $174,485.00
Funds Obligated to Date: FY 2015 = $174,485.00
History of Investigator:
  • David Reitter (Principal Investigator)
Recipient Sponsored Research Office: Pennsylvania State Univ University Park
201 OLD MAIN
UNIVERSITY PARK
PA  US  16802-1503
(814)865-1372
Sponsor Congressional District: 15
Primary Place of Performance: Pennsylvania State Univ University Park
316D IST Building
University Park
PA  US  16802-6823
Primary Place of Performance
Congressional District:
Unique Entity Identifier (UEI): NPM2J7MSCF61
Parent UEI:
NSF Program(s): Robust Intelligence
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7495, 8228
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Language use in real-world dialogue happens in context. Linguistic choices depend on previous ones: for example, the chosen words and sentence structures tend to mirror what was used previously by a conversation partner. This subtle adaptation process has been called "alignment". Alignment appears to help people understand each other in dialogue, and it seems to extend to human-computer interfaces, too. The concrete functions of alignment in dialogue are, however, unclear. Is it merely a useful epiphenomenon of how human memory works? Does it serve as a social or communicative signal? Is it indicative of a person's empathy? Does it help communities find a common language over long periods of time? Recent work has established that one of the consequences of alignment is persistent language change in the individual. There is also preliminary evidence that over time, groups of people talking to one another will converge in their choice of words and sentence structure. In other words, they find a common language. The project will devise computational models that describe and quantify these processes. With these models, one can detect the processes in actual language use, such as in web-forums. In fact, the project will use large datasets from decades of web-forum messages to produce those models. The computational models will explain and predict processes in a way that makes them exploitable in modern social networks as well as for data science. Consider the example of a web-forum that connects those suffering from a disease so they can lend each other emotional and informational support. The models can detect and predict which messages in this web-forum are most supportive on the intended level, and whether they align to the person asking a question. One possible application is to improve web-forum discourse by prioritizing search results and by making reading suggestions. Alignment models may also improve analysis techniques for large datasets by spotting networks of mutual supporters.

Models will be created to describe and explain alignment and language change in natural-language dialogue. The models will be computational and statistical to allow for exploitation of interactive alignment in natural-language dialogue as a feature in social network applications. Statistical alignment models describe language change over time as a function of variables that characterize the individual's behavior and memory, and of network information. These models will be fitted to longitudinal datasets derived from web-based, topic-oriented conversation threads. At the individual level, they will help refine cognitive-computational models of memory function in language production, which will be constrained by the well-validated ACT-R framework. The viability of the approach is supported by preliminary work on corpus-based syntactic priming and ACT-R models of language production, and by pilot experiments showing alignment in the corpus. The outcomes of the project may point to novel methods of prioritizing and filtering the most helpful content and can address quality of life and well-being of patients such as those of the peer-support community whose conversations were studied in the investigator's work motivating the proposal.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Cole, Jeremy R. and Reitter, David "The Role of Working Memory in Syntactic Sentence Realization: A modeling & simulation approach" Cognitive Systems Research , v.55 , 2019 , p.95-106 10.1016/j.cogsys.2019.01.001
Cole, J., Ghafurian, M., & Reitter, D. "Is Word Adoption a Grassroots Process? An Analysis of Reddit Communities." Social, Cultural, and Behavioral Modeling , 2017 , p.236 10.1007/978-3-319-60240-0_28
Cole, J. R., & Reitter, D. "The timing of lexical memory retrievals in language production." Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , v.1: Long , 2018 , p.2017 https://doi.org/10.18653/v1/N18-1183
Cole, J., Xu, Y., & Reitter, D. "How people talk about armed conflicts: An analysis of Reddit data." Social, Cultural, and Behavioral Modeling , 2016
Xu and Reitter "Convergence of Syntactic Complexity in Conversation." Proc. 54th Annual Mtg. of the Association for Computational Linguistics , 2016 , p.443
Xu and Reitter "Entropy converges between dialogue participants: explanations from an information-theoretic perspective." Proc. 54th Annual Mtg. of the Association for Computational Linguistics , 2016 , p.537
Xu and Reitter "Entropy in conversation: Towards an information-theoretic model of dialogue" Cognition , 2018
Xu and Reitter "Spectral Analysis of Information Density in Dialogue Predicts Collaborative Task Performance" Proc. 55th Annual Mtg. of the Association for Computational Linguistics , 2017
Xu, Yang and Reitter, David "Information density converges in dialogue: Towards an information-theoretic model" Cognition , 2018 , p.147-163 10.1016/j.cognition.2017.09.018
Xu, Y., Cole, J. R., & Reitter, D. "Not that much power: Linguistic alignment is influenced more by low-level linguistic features rather than social power" Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics , 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The project helped us understand how people strategically distribute information across their conversations.  We studied how these strategies can make conversation more or less effective, leading to successful outcomes in collaborative tasks.  Using existing datasets of task-oriented dialogue in English and Danish, we identified features that capture the periodic changes in information density between two interlocutors.  Information density is defined as how well a computer model can predict words in each sentence, given the words that were used up to that point: the level of "surprisal" experienced when hearing a word is indicative of information.  We find that in successful communication, interlocutors take turns in contributing information, and that an increase in information from an interlocutor responding to a topic can indicate effective understanding of what is conveyed.  This project developed computer models that can automatically analyze this and, thus, automatically determine how effective a given conversation is.  The project also examined adaptation effects in conversations.  In addition to the prepared English and Danish datasets, we used dialogue datasets acquired from internet forums, which proved to be a rich source of language data.
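
To make the notion of information density concrete, the following is a minimal sketch, not the project's actual model.  It assumes a simple bigram language model with add-one smoothing, where the surprisal of a word is -log2 P(word | previous word) and the information density of a sentence is the mean surprisal of its words.  The function names, the "<s>" start marker, and the toy corpus in the usage note are illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    contexts, bigrams = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent  # "<s>" marks the sentence start (an assumption)
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, word)] += 1
    return contexts, bigrams, len(vocab)


def surprisal(prev, word, contexts, bigrams, vocab_size):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing."""
    p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return -math.log2(p)


def information_density(sentence, contexts, bigrams, vocab_size):
    """Mean per-word surprisal of one non-empty tokenized sentence."""
    if not sentence:
        raise ValueError("sentence must contain at least one token")
    tokens = ["<s>"] + sentence
    values = [
        surprisal(prev, word, contexts, bigrams, vocab_size)
        for prev, word in zip(tokens, tokens[1:])
    ]
    return sum(values) / len(values)
```

As a usage illustration, after training on a toy corpus such as [["the", "map", "is", "here"], ["the", "map", "is", "there"], ["the", "route", "is", "here"]], a familiar sentence like ["the", "map", "is", "here"] receives a lower information density than the same words in an unseen order.  A stronger language model could be swapped in for the bigram counts without changing the definition of surprisal.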

The work under this grant identified important features of effective communication, which are now being applied to develop industrial human-computer dialogue systems.

Last Modified: 09/30/2019
Modified by: David T Reitter

Please report errors in award information by writing to: awardsearch@nsf.gov.
