National Science Foundation - Computer & Information Science & Engineering (CISE)
Computing and Communication Foundations (CCF)
Press Release 11-264
Tool Enables Scientists to Uncover Patterns in Vast Data Sets

Relationships discovered in data will shed light on vexing problems and increase human understanding

Illustration showing humans evolving to cope with growing data sets.

Evolving to cope with growing data sets, a pictorial representation.

December 15, 2011

With support from the National Science Foundation, researchers from the Broad Institute and Harvard University recently developed a tool that can uncover patterns in large data sets in a way that no other software program can.

Called the Maximal Information Coefficient, or MIC, the tool can tease out multiple recurring events or sets of data hidden in health information from around the globe, in the changing bacterial landscape of the gut, or even in statistics amassed from a season of competitive sports--and much more. The researchers report their findings in the Dec. 16 issue of the journal Science.

Part of a suite of statistical tools called MINE, for Maximal Information-based Nonparametric Exploration, MIC can sort through today's mass of research variables, whether they come from attempts to track hurricanes, efforts to model earthquakes, endeavors to identify the Higgs boson, or efforts to glean insights into the world economy and social networking interactions.

Researchers currently use advanced technology to gather big, complex data sets. These data sets can greatly enhance understanding of a system, but only if the vast amounts of data can be organized so that telling information can be extracted. Sophisticated computer programs search these data sets with great speed, but fall short in even-handedly detecting different kinds of patterns in large data collections, a capability that is essential for more sophisticated analysis.

One of the greatest strengths of MIC, the newly developed tool within MINE, is its ability to detect a broad spectrum of patterns and to characterize them according to whichever parameters a researcher is interested in. Other statistical tools work well when searching for a specific pattern in a large data set, but cannot score and compare different kinds of possible relationships. Researchers can also use MINE to generate new ideas and connections.
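
To illustrate why a pattern-specific statistic can miss a relationship that a more general score picks up, here is a minimal sketch in Python. It is not the MIC algorithm and not the researchers' code: it compares the Pearson correlation coefficient with a mutual-information score computed on one fixed 8-by-8 grid. The fixed grid, the bin count, the normalization by log2(bins), the function names and the synthetic data are all assumptions chosen for this example.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient: sensitive to linear relationships only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def binned_mutual_information(xs, ys, bins=8):
    """Mutual information on one fixed equal-width grid, divided by log2(bins).

    Illustrative simplification (an assumption of this sketch), not MIC itself.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid dividing by zero for constant data
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))          # counts per grid cell
    count_x, count_y = Counter(bx), Counter(by)

    mi_bits = 0.0
    for (i, j), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi_bits += (c / n) * math.log2(c * n / (count_x[i] * count_y[j]))
    return mi_bits / math.log2(bins)


# Synthetic data: one linear and one parabolic relationship over the same x values.
xs = [-1 + 2 * i / 400 for i in range(401)]
line = [2 * x + 1 for x in xs]
parabola = [x * x for x in xs]

for name, ys in (("line", line), ("parabola", parabola)):
    print(name, "pearson:", round(pearson(xs, ys), 3),
          "grid score:", round(binned_mutual_information(xs, ys), 3))
```

In this sketch the Pearson coefficient is close to zero for the parabola even though y is completely determined by x, while the grid-based score remains clearly positive for both shapes. Scoring different kinds of relationships on a common scale is the idea the press release attributes to MIC; the published method is considerably more elaborate than a single fixed grid.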

Learn more about MINE, MIC and the patterns identified in biological and health data, as well as in statistics from the 2008 baseball season, by visiting the Broad Institute website. A video about this work is also available on the website.

-NSF-

Media Contacts
Lisa-Joy Zgorski, NSF, (703) 292-8311, lisajoy@nsf.gov
Haley Bridger, Broad Institute of MIT and Harvard, (617) 714-7968, hbridger@broadinstitute.org

Related Websites
Broad Institute of MIT and Harvard: http://www.broadinstitute.org/
MINE program website: http://www.exploredata.net/
Video about this new tool: http://www.broadinstitute.org/node/3783/

The National Science Foundation (NSF) is an independent federal agency that supports fundamental research and education across all fields of science and engineering. In fiscal year (FY) 2014, its budget is $7.2 billion. NSF funds reach all 50 states through grants to nearly 2,000 colleges, universities and other institutions. Each year, NSF receives about 50,000 competitive requests for funding, and makes about 11,500 new funding awards. NSF also awards about $593 million in professional and service contracts yearly.

Useful NSF Web Sites:
NSF Home Page: http://www.nsf.gov
NSF News: http://www.nsf.gov/news/
For the News Media: http://www.nsf.gov/news/newsroom.jsp
Science and Engineering Statistics: http://www.nsf.gov/statistics/
Awards Searches: http://www.nsf.gov/awardsearch/

 

Photo of David and Yakir Reshef who developed MIC.
David and Yakir Reshef developed MIC with guidance from Harvard and Broad Institute professors.

Illustration of a stack of paper rising above the city skyline.
Research examining abundance levels of bacteria in the human gut produced an abundance of data.

Cover of the December 16, 2011 issue of the journal Science.
The researchers' work is described in the December 16, 2011 issue of the journal Science.


