Award Abstract # 0965436
Collaborative Research: New Methods to Enhance Our Understanding of the Diversity of Science

NSF Org: SMA
SBE Office of Multidisciplinary Activities
Recipient: UNIVERSITY OF MASSACHUSETTS
Initial Amendment Date: May 10, 2010
Latest Amendment Date: March 7, 2016
Award Number: 0965436
Award Instrument: Standard Grant
Program Manager: Maryann Feldman
SMA
 SBE Office of Multidisciplinary Activities
SBE
 Directorate for Social, Behavioral and Economic Sciences
Start Date: May 15, 2010
End Date: September 30, 2016 (Estimated)
Total Intended Award Amount: $286,940.00
Total Awarded Amount to Date: $286,940.00
Funds Obligated to Date: FY 2010 = $286,940.00
History of Investigator:
  • Hanna Wallach (Principal Investigator)
    wallach@cs.umass.edu
  • Andrew McCallum (Co-Principal Investigator)
  • Andrew McCallum (Former Principal Investigator)
  • Hanna Wallach (Former Co-Principal Investigator)
Recipient Sponsored Research Office: University of Massachusetts Amherst
101 COMMONWEALTH AVE
AMHERST
MA  US  01003-9252
(413)545-0698
Sponsor Congressional District: 02
Primary Place of Performance: University of Massachusetts Amherst
101 COMMONWEALTH AVE
AMHERST
MA  US  01003-9252
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): VGJHK59NMPK9
Parent UEI: VGJHK59NMPK9
NSF Program(s): SciSIP-Sci of Sci Innov Policy
Primary Program Source: 01001011DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 0000, OTHR
Program Element Code(s): 762600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

This project focuses on the development and implementation of new quantitative methods to provide a deeper understanding of science policy interventions. By building analytic tools that capture the diversity of science, this project moves beyond existing methods that typically analyze the rate of scientific innovation. This move is an important next step in the "science of science policy" agenda.

Intellectual Merit: Although understanding of the effects of institutional changes on the rate of inventive activity has improved markedly in recent years, effective science policy interventions must also be grounded in an understanding of their impact on diversity as well as productivity, construed both in terms of idea diversity -- the array of different ideas derived from novel scientific insights -- and individual diversity -- the variety of people and organizations in social space engaged in scientific progress. To move forward with this crucial agenda requires a rich new set of tools. In developing such tools, this project extends prior work that focuses on "citation-counting," combining novel approaches from social and computer sciences to represent and analyze publication, patent and grant data in idea and social space. Specifically, the tools integrate two powerful methods: (a) statistical topic modeling and (b) social network analysis.

Broader Impact: These methods can also be extended to examine diversity across national, social and topic boundaries, thus providing quantitative tools to characterize issues of key significance in debates over national competitiveness. While these science policy questions could be addressed in a wide variety of settings, this project focuses on the varied data associated with the human genome and human genetics.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 11)
A. Handler, M. Denny, H. Wallach, and B. O'Connor "Bag of What? Simple Noun Phrase Extraction for Text Analysis" EMNLP Workshop on NLP and Computational Social Science , 2016
A. Schein, J. Paisley, D. Blei, and H. Wallach "Bayesian Poisson Tensor Factorization for Inferring Multilateral Relations from Sparse Dyadic Event Counts" 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining , 2015
A. Schein, M. Zhou, D. Blei, and H. Wallach "Bayesian Poisson Tucker Decomposition" Thirty-Third International Conference on Machine Learning , 2016
A. Schein, M. Zhou, H. Wallach "Poisson--Gamma Dynamical Systems" Neural Information Processing Systems , 2016
E. Talley, D. Newman, D. Mimno, B. Herr II, H. Wallach, G. Burns, M. Leenders and A. McCallum "A Database of National Institutes of Health (NIH) Research Using Machine Learned Categories and Graphically Clustered Grant Awards" Nature Methods , 2011
F. Guo, C. Blundell, H. Wallach, and K. Heller "The Bayesian echo chamber: Modeling social influence via linguistic accommodation" International Conference on Artificial Intelligence and Statistics , 2015
Furman, JL; Murray, F; Stern, S "More for the research dollar" NATURE , v.468 , 2010 , p.757
G. Zanella, B. Betancourt, H. Wallach, J. Miller, A. Zaidi, R. Steorts "Flexible Models for Microclustering with Application to Entity Resolution" Neural Information Processing Systems , 2016
Huang, KG; Murray, FE "Entrepreneurial experiments in science policy: Analyzing the Human Genome Project" RESEARCH POLICY , v.39 , 2010 , p.567 10.1016/j.respol.2010.02.00
Mimno, D., Wallach, H., Talley, E., Leenders, M. and McCallum, A. "Optimizing Semantic Coherence in Topic Models" Proceedings of Empirical Methods in Natural Language Processing , 2011
Passos, A., Wallach, H. and McCallum, A. "Correlations and Anticorrelations in LDA Inference" NIPS 2011 Workshop on Learning Challenges in Hierarchical Model. Granada, Spain, December 10-17, 2011 , 2011

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project focused on the development of analytical tools capable of
capturing the diversity of science. Our primary goal was to develop
and implement new methods, grounded in the computer science literature
(specifically statistical topic modeling and social network analysis),
for analyzing the impact of policy interventions on the diversity of
science. Our tools moved beyond traditional "citation-counting"
methods that focus only on the rate of scientific innovation.

We developed several new statistical topic modeling techniques for
analyzing scientific publications and patents. These include an
experimental framework for identifying unlabeled similar document pairs,
new methods for quantifying and improving the quality or "coherence"
of topics produced by topic models, an investigation of correlations
and anti-correlations between inferred topics, a method for
pre-analysis phrase discovery, and a method for post-analysis
discovery and presentation of topic-specific phrases.
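
As an illustration of what quantifying topic "coherence" can look like,
the sketch below scores a topic by how often its top words co-occur in
the same documents. It is a minimal example under stated assumptions,
not the project's implementation: the scoring formula is one commonly
used document co-occurrence formulation, and the function name, the
argument names, and the toy corpus are all hypothetical.

```python
import math


def cooccurrence_coherence(top_words, documents):
    """Score one topic from document co-occurrence counts (illustrative only).

    top_words: the topic's top words, ordered from most to least probable.
    documents: an iterable of token lists, one list per document.
    Scores closer to zero mean the top words tend to appear together.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # assumption: skip words absent from the corpus
            joint = doc_freq(top_words[m], top_words[l])
            # Adding 1 to the joint count avoids taking log(0).
            score += math.log((joint + 1) / denom)
    return score


# Hypothetical toy corpus, for illustration only.
toy_docs = [
    ["gene", "dna", "sequence"],
    ["gene", "dna"],
    ["patent", "law"],
    ["gene", "patent"],
]
coherent_score = cooccurrence_coherence(["gene", "dna"], toy_docs)
mixed_score = cooccurrence_coherence(["gene", "law"], toy_docs)
```

In this toy corpus, the pair of words that share documents receives a
higher score than the pair that never appears together, which is the
behavior a coherence measure of this kind is meant to capture.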

We also focused on jointly modeling three aspects of scientific and
commercial communities: structure, content, and dynamics. We developed
new methods for analyzing the dynamics of information transfer between
entities, in order to quantify the ways in which these dynamics are
driven by "social" factors. Our methods draw on ideas from Bayesian
tensor factorization (shown to be effective for community detection in
social networks and topic discovery in text corpora) to form a unified
framework for modeling the structure, content, and dynamics of
communities. To complement this work, we also developed a new
general-purpose Bayesian admixture model for discovering and
visualizing topic-specific subnetworks in collaboration networks.

Finally, we explored issues surrounding fairness, accountability,
transparency, and ethics that arise when studying social processes,
such as those that underlie scientific publication and patenting.


Last Modified: 01/04/2017
Modified by: Hanna M Wallach

Please report errors in award information by writing to: awardsearch@nsf.gov.
