Award Abstract # 2211939
HCC: Medium: Improving data visualization and analysis tools to support reasoning about analysis assumptions

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: NORTHWESTERN UNIVERSITY
Initial Amendment Date: August 24, 2022
Latest Amendment Date: August 24, 2022
Award Number: 2211939
Award Instrument: Standard Grant
Program Manager: Han-Wei Shen
hshen@nsf.gov
(703)292-2533
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2022
End Date: September 30, 2026 (Estimated)
Total Intended Award Amount: $1,194,588.00
Total Awarded Amount to Date: $1,194,588.00
Funds Obligated to Date: FY 2022 = $1,194,588.00
History of Investigator:
  • Jessica Hullman (Principal Investigator)
    jhullman@northwestern.edu
  • Matthew Kay (Co-Principal Investigator)
Recipient Sponsored Research Office: Northwestern University
633 CLARK ST
EVANSTON
IL  US  60208-0001
(312)503-7955
Sponsor Congressional District: 09
Primary Place of Performance: Northwestern University
2145 Sheridan Road
Evanston
IL  US  60208-3106
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): EXZVPWZBLUE8
Parent UEI:
NSF Program(s): HCC-Human-Centered Computing
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7367, 7924
Program Element Code(s): 736700
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Using statistics to model data often requires making assumptions about what the data represent, including the nature of underlying patterns and errors. For example, an analyst interested in the relationship between wealth and age might remove extremely low or high reported ages from her data as outliers, and choose to model the remaining data with a linear model, which implies that wealth changes by a constant amount per year of age. Her results could support substantially different conclusions about how wealth and age relate than an analysis that made different decisions about which data to include. One way for analysts to account for such sensitivity is to report the results of many reasonable analyses given a dataset and the questions they want to answer with it. Unfortunately, existing data analysis and visualization tools offer little support for reasoning about such a "multiverse analysis." They provide limited support for comparing multiple models and visualizations that make different choices, and even less support for helping analysts reason about and express those choices. Further, there are few known ways to effectively convey both the uncertainty in the results of a given analysis and the uncertainty related to the assumptions made in that analysis. This project's goal is to improve multiverse analysis: to better understand how analysts currently think about multiverse analysis, and to identify needs, opportunities, and approaches that help analysts use multiverse analyses.
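
To make the idea concrete, the following minimal Python sketch (our illustration, not part of the award materials; the synthetic data, age cutoffs, and model forms are all invented) enumerates a small multiverse of outlier-exclusion and model-form choices and compares the slope estimate each analysis path produces:

    import itertools
    import numpy as np

    # Synthetic wealth-age data with a known linear trend, plus two
    # implausible reported ages that an analyst might treat as outliers.
    rng = np.random.default_rng(0)
    age = rng.uniform(18, 90, 500)
    wealth = 2.0 * age + rng.normal(0, 40.0, 500)
    age = np.append(age, [1.0, 150.0])
    wealth = np.append(wealth, [5.0, 60.0])

    # Reasonable-but-different analysis choices: which ages to keep,
    # and whether to model wealth against age or log(age).
    age_bounds = [(0, 200), (18, 100), (21, 80)]
    model_forms = ["linear", "log"]

    # One "universe" per combination of choices; fit each and compare.
    for (lo, hi), form in itertools.product(age_bounds, model_forms):
        keep = (age >= lo) & (age <= hi)
        x = age[keep] if form == "linear" else np.log(age[keep])
        slope, intercept = np.polyfit(x, wealth[keep], 1)
        print(f"ages in [{lo}, {hi}], {form} model: slope = {slope:8.2f}")

Even in this toy setting, the six analysis paths yield visibly different slope estimates, which is exactly the sensitivity a multiverse analysis is meant to surface.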

This project addresses these challenges in expressing hard-to-quantify uncertainty about analysis choices by creating new methods and tools that help analysts define, reason about, and express the multiple alternative ways they could analyze their data. The research will focus on two common types of tools analysts use: visual analysis software that makes it easy to plot and compare data, and computational notebooks that allow for more seamless integration of code and narrative commentary. The project team will develop new user interfaces and programming libraries to elicit analysts' knowledge, as well as new visual representations and interaction techniques by which an analyst can compare alternative models or analysis paths. The project will also produce novel software infrastructure to make conducting and evaluating multiple analyses feasible within existing tools and workflows. Further, the team will develop ways to better communicate multiverse analyses: ways to make multiverse analysis reports shareable, interactive documents that contain the analysis code and figures as well as narrative context, and empirical results describing how different representations of plausible analyses affect readers' understanding. These research activities will be guided by formative studies with real-world analysts that address gaps in existing knowledge about the difficulties analysts face in defining and reasoning about alternative models or analysis steps they could have taken. All study results and computational tools will be made freely and publicly available.
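
As a hedged illustration of the communication side (again our sketch under invented numbers, not the project's actual tooling), a multiverse report might sort per-universe estimates into a simple text specification curve so a reader can see at a glance whether a conclusion depends on the analysis path:

    # Invented per-universe estimates of a wealth-age slope; each key names
    # one combination of analysis choices (one "universe").
    estimates = {
        ("drop ages <18 or >100", "no covariates"):    1.9,
        ("drop ages <18 or >100", "adjust education"): 1.4,
        ("drop ages <21 or >80",  "no covariates"):    2.3,
        ("drop ages <21 or >80",  "adjust education"): 1.7,
        ("keep all ages",         "no covariates"):   -0.4,  # outliers flip the sign
        ("keep all ages",         "adjust education"): 0.1,
    }

    # Order universes by estimate: a text-only "specification curve".
    for (exclusion, covariates), est in sorted(estimates.items(), key=lambda kv: kv[1]):
        print(f"{est:+5.1f} {'#' * int(abs(est) * 10):<25} {exclusion}; {covariates}")

    # Summarize robustness across the multiverse.
    positive = sum(v > 0 for v in estimates.values())
    print(f"\n{positive}/{len(estimates)} universes estimate a positive wealth-age relationship")

The sorted display makes the sensitive decision obvious: in this invented example, only the universes that keep all reported ages fail to find a clearly positive relationship.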

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Guo, Ziyang; Wu, Yifan; Hartline, Jason D.; Hullman, Jessica. "A Decision Theoretic Framework for Measuring AI Reliance." 2024. https://doi.org/10.1145/3630106.3658901
Kale, Alex; Guo, Ziyang; Qiao, Xiao-li; Heer, Jeffrey; Hullman, Jessica. "EVM: Incorporating Model Checking into Exploratory Visual Analysis." IEEE Transactions on Visualization and Computer Graphics, 2023.
Sarma, A.; Pu, X.; Cui, Y.; Correll, M.; Brown, E.; Kay, M. "Odds and Insights: Decision Quality in Exploratory Data Analysis Under Uncertainty." 2024.
Sarma, Abhraneel; Hwang, Kyle; Hullman, Jessica; Kay, Matthew. "Milliways: Taming Multiverses through Principled Evaluation of Data Analysis Paths." 2024.
