
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | August 22, 2011 |
Latest Amendment Date: | August 22, 2011 |
Award Number: | 1117132 |
Award Instrument: | Standard Grant |
Program Manager: | Maria Zemankova, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2011 |
End Date: | August 31, 2015 (Estimated) |
Total Intended Award Amount: | $499,995.00 |
Total Awarded Amount to Date: | $499,995.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
W5510 FRANKS MELVILLE MEMORIAL LIBRARY STONY BROOK NY US 11794-0001 (631)632-9949 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
W5510 FRANKS MELVILLE MEMORIAL LIBRARY STONY BROOK NY US 11794-0001 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | GRAPHICS & VISUALIZATION |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The goal of this research project is to devise new visualization tools that help scientists gain insight from their high-dimensional data. High-dimensional data are observations with many attributes, on the order of hundreds or more. Today's data are often inherently high-dimensional: DNA microarrays, financial tick-by-tick data, and hyper-spectral imagery, to name just a few. The challenge in visualizing these data comes from the limited dimensionality of the screen. Traditional data visualization paradigms cannot fully map high-dimensional properties to a two-dimensional display without losing inherent semantics, patterns, or structure. This can lead to ambiguous and even misleading visualizations. To bridge this fundamental gap, the display system developed in this project uses methods gleaned from illustrative design to communicate these elusive properties, which are derived from analysis in the high-dimensional data space. A second important motivation of this research is that this illustration-inspired approach is expected to produce visualizations that are easier to interpret and manipulate.
The overall theme of this work is to use information abstraction and illustrative mappings to improve display comprehensibility, reduce unnecessary complexity, and communicate high-dimensional data patterns more faithfully. The illustrative framework is driven by a two-pronged data analysis suite: filtering creates a data representation at multiple levels of scale, and pattern classification identifies suitable appearance illustrations. Both analyses are performed in the native high-dimensional data space to preserve the original structures. Various illustrative styles are linked to visual semantics to provide an intuitive data display. The generality of the framework allows it to map readily to the three most prominent high-dimensional visualization platforms: space embeddings, parallel coordinates, and scatter plots. Illustrative visualization design and validation are carried out in collaboration with experts in Environmental Science and the Human Microbiome Project.
The system is designed to support domain scientists in knowledge discovery, but also to appeal to casual users by supporting data analysis via illustrative design. The display looks more natural because it uses familiar graphic design paradigms to construct the illustrative visualizations. The project webpage (http://www.cs.sunysb.edu/~mueller/IllustratorND) provides information on ongoing progress, invites participation in user studies, and provides some data analysis capabilities within a web-enabled version of the software. The project offers educational and research opportunities for students.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The goal of this research has been to devise new visualization tools that help scientists gain insight from their high-dimensional data. High-dimensional data are observations with many attributes, on the order of hundreds or more. Today's data are often inherently high-dimensional: DNA microarrays, financial tick-by-tick data, and hyper-spectral imagery, to name just a few. But even for data with a dozen or fewer attributes, it can be difficult to appreciate the complex multivariate interactions the attributes have with one another. These types of data are commonplace in everyday life. Examples are the specifications of a camera or a car, the various criteria for selecting a college or your next vacation hotel, or the characteristics of a bottle of fine wine. Trying to find the most favorable camera, car, college, hotel, or wine in the presence of many conflicting features within a conventional table is often hopeless. This is where visualization can be of tremendous help.
However, visualizing high-dimensional (ND) data is challenging due to the limited dimensionality of the screen. Traditional data visualization paradigms cannot fully map high-dimensional properties to a two-dimensional display without losing inherent semantics, patterns, or structure. This can lead to ambiguous and even misleading visualizations. To bridge this fundamental gap, the display systems developed in this project use methods gleaned from design and mapping to communicate these elusive properties, which are derived from analysis of the high-dimensional data. In the following, we describe some of the visualization frameworks we have devised and created over the duration of this grant. Each tool emphasizes different characteristics of the data, and some are dedicated to specific domain applications.
Data Context Map: An interactive map that accurately visualizes the data items in the context of the data attributes. Via a set of sliders, users can establish tunable decision boundaries for data item selection tasks in the presence of tradeoffs. For example, the Data Context Map for a set of colleges would place top universities that have high tuition but only a minor athletic program close to the Academics and Tuition “cities” but far away from the Athletics “city”.
Correlation Map: This interactive map focuses on correlations that may exist among the attributes. Attributes that are closely related, such as top speed and horsepower for a car dataset, will be plotted close to one another. The map also offers multi-scale semantic zooming that can achieve scalability for large numbers of variables and data.
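The idea that closely related attributes land close together can be illustrated with a small sketch. This is not the project's algorithm; it only assumes a common convention in which the distance between two attributes is 1 minus the absolute value of their Pearson correlation, which a layout method of one's choice could then turn into map positions. The attribute names and the synthetic car data below are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_distances(columns):
    """Map each attribute pair to 1 - |r|; strongly related pairs get small values."""
    names = list(columns)
    return {
        (a, b): 1.0 - abs(pearson(columns[a], columns[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical synthetic car data: top speed depends on horsepower,
# while cargo volume is generated independently of both.
rng = random.Random(0)
horsepower = [rng.uniform(80, 400) for _ in range(200)]
top_speed = [0.4 * hp + 100 + rng.gauss(0, 5) for hp in horsepower]
cargo_volume = [rng.uniform(200, 600) for _ in range(200)]

distances = correlation_distances({
    "horsepower": horsepower,
    "top_speed": top_speed,
    "cargo_volume": cargo_volume,
})

for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

In this sketch the horsepower and top speed pair receives a much smaller distance than either pair involving cargo volume, so a distance-preserving layout would draw the first two attributes next to each other.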
The Visual Causality Analyst: Deriving causation from observational data can lead to spurious relations in which the inferred cause-and-effect relationship is merely coincidental. The Visual Causality Analyst provides a novel visual causal reasoning framework that allows users to apply their expertise, verify and edit causal links, and collaborate with the causal discovery algorithm to identify a valid causal network.
Stream Vis ND: This is a framework for the visualization of streaming multivariate data. It illustrates these time-varying multivariate data as a temporal similarity display which enables quick recognition of relationship...