Solution Found to Long-Standing Inconsistencies in Data Analysis
Final Exam Question: 40% of children in a high school participated in a special college preparation program, and 40% of students from that high school went on to college. For 50 bonus points, what fraction of participants in the college prep program went on to college?
This is a trick question. Until now, the only way to be sure of the answer would be to violate confidentiality laws and track down the individual students.
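To see why the question has no single answer, the group totals can be run through the classic "method of bounds", a simple deterministic check that is not King's method. The sketch below is illustrative only: the function name `ecological_bounds` and its parameter names are hypothetical, and it assumes nothing is known beyond the two percentages given in the question.

```python
from typing import Tuple


def ecological_bounds(share_in_group: float, share_with_outcome: float) -> Tuple[float, float]:
    """Return the lowest and highest possible fraction of group members
    who have the outcome, given only the two aggregate shares.

    share_in_group:     fraction of the population in the group
                        (here, students in the college prep program)
    share_with_outcome: fraction of the population with the outcome
                        (here, students who went on to college)
    """
    if not 0.0 < share_in_group <= 1.0:
        raise ValueError("share_in_group must be in (0, 1]")
    if not 0.0 <= share_with_outcome <= 1.0:
        raise ValueError("share_with_outcome must be in [0, 1]")

    share_outside_group = 1.0 - share_in_group
    # Lowest case: as many outcomes as possible fall outside the group.
    lower = max(0.0, (share_with_outcome - share_outside_group) / share_in_group)
    # Highest case: as many outcomes as possible fall inside the group.
    upper = min(1.0, share_with_outcome / share_in_group)
    return lower, upper


# The exam question: 40% in the program, 40% went on to college.
low, high = ecological_bounds(0.40, 0.40)
print(f"Between {low:.0%} and {high:.0%} of participants went on to college")
# prints "Between 0% and 100% of participants went on to college"
```

With these totals the bounds are as wide as they can be: every participant could have gone to college, or none could have. The group figures alone cannot settle the question, which is the gap the ecological inference problem describes.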
Now, a National Science Foundation (NSF)-supported political scientist has a solution to a long-standing, consequential problem in social science methodology: how to learn about the behavior of individuals when the only information available is on groups.
The solution may have been found in a statistical method developed by Gary King, professor of government at Harvard University. His new algorithm, designed for use in computer software, is reported in a book recently published by Princeton University Press, A Solution To The Ecological Inference Problem: Reconstructing Individual Behavior From Aggregate Data.
King's new method may have a significant impact on a range of research problems, such as epidemiological studies of radon and lung cancer, market research on consumer behavior and implementation of the Voting Rights Act. The American Political Science Association has selected King to receive its Gosnell Award "for the best methodological work in political science in 1995-96" for his research on this subject.
"I expect Gary King's solution will contribute to the production of more accurate, insightful data analysis in a variety of research studies, leading to more informed policy-making and better understanding of our economy and society," Frank Scioli, director of NSF's political science research program, says.
Inferring individual behavior from statistics recorded about groups, known as the "ecological inference problem," was originally posed over 75 years ago. It was the first statistical problem encountered in the new field of political science. Scholars soon recognized the same problem in numerous other areas, and since then researchers have pursued a solution.
"Ecological inference is required whenever surveys are unavailable, unreliable or too expensive," says King. "Surveys cannot address most historical questions unless they are conducted then and there. They are also unreliable for studying controversial issues, such as racial politics, since respondents do not always report their opinions and behaviors accurately."
The ecological inference problem was originally raised in 1919 by scholars seeking to know how women, who were about to gain the vote nationwide, would cast their ballots. Although women had voted in some state elections, and these data were available, the secret ballot and the ecological inference problem prevented analysts from distinguishing the votes of women from the remaining (male) votes in the same electoral precincts.
The United States and other governments produce enormous quantities of statistical data on aggregates such as towns, cities, congressional districts and census blocks. A solution to the ecological inference problem will give researchers and public policy makers the ability to better analyze data and learn about individual behavior.
King tested his method on data sets of groups for which the individual behaviors were known. He made more than 16,000 comparisons between his estimates and the known individual behaviors. NSF provided the support to gather the data and to develop methods for its analysis.
Applications of the Ecological Inference Solution
Several research areas may benefit from the ecological inference solution developed by Gary King, professor of government at Harvard University, with the support of the National Science Foundation.
For more information on his research, see http://gking.harvard.edu.
The National Science Foundation (NSF) is an independent federal agency that supports fundamental research and education across all fields of science and engineering. In fiscal year (FY) 2016, its budget is $7.5 billion. NSF funds reach all 50 states through grants to nearly 2,000 colleges, universities and other institutions. Each year, NSF receives more than 48,000 competitive proposals for funding and makes about 12,000 new funding awards. NSF also awards about $626 million in professional and service contracts yearly.