Computer Program Reveals Anyone's Ancestry

Researchers develop computer algorithm that can trace the genetic ancestry of thousands of individuals in minutes

Plot of genetic markers and world map graphic.

This plot of genetic markers shows 255 individuals from four continental regions.

May 5, 2008

Imagine being adopted, with no understanding of your cultural or genetic background. You don't know your heritage or what diseases you are genetically predisposed to. Most of us have some idea about the roots of our family tree, but little understanding of what those lower branches mean in terms of our predisposition to a host of diseases and ailments.

Now, a group of computer scientists, mathematicians and biologists from around the world has developed a computer algorithm that can quickly trace an individual's genetic ancestry from only a small sample of DNA. In fact, the program can trace the genetic ancestry of thousands of individuals in minutes, without any prior knowledge of their background.

The multi-disciplinary approach, published in the September 2007 edition of the journal PLoS Genetics, allowed the research team to address this type of research in a novel way. Unlike previous computer programs that required prior knowledge of an individual's ancestry and background, the new algorithm looks for specific DNA markers known as single nucleotide polymorphisms, or SNPs (pronounced snips), and needs nothing more than a DNA sample in the form of a simple cheek swab.
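The article does not describe the algorithm's internals, so the following is only a minimal sketch of one common way to infer population structure from SNP data without prior labels: principal component analysis on a genotype matrix. Everything here is an assumption made for illustration, including the 0/1/2 genotype encoding, the synthetic allele frequencies, and the hypothetical function names `simulate_genotypes` and `top_components`. It should not be read as the team's actual method.

```python
import numpy as np


def simulate_genotypes(n_per_pop=50, n_snps=500, seed=0):
    """Create synthetic genotypes (0, 1 or 2 copies of an allele per SNP)
    for two made-up populations with slightly different allele frequencies."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_snps)
    # Population B drifts away from population A at every SNP (assumed model).
    freq_b = np.clip(freq_a + rng.uniform(-0.3, 0.3, n_snps), 0.05, 0.95)
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return np.vstack([pop_a, pop_b]), labels


def top_components(genotypes, k=2):
    """Project each individual onto the top-k principal components."""
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


genotypes, labels = simulate_genotypes()
pcs = top_components(genotypes, k=2)

# Split individuals by the sign of the first component. No labels are used
# to form the groups; they are used only afterwards to score the split.
guess = (pcs[:, 0] > 0).astype(int)
agreement = max(np.mean(guess == labels), np.mean(guess != labels))
print(f"agreement with simulated ancestry: {agreement:.2f}")
```

In this toy setup the first component alone tends to separate the two simulated groups, which illustrates how ancestry can in principle be recovered from the DNA markers themselves rather than from self-reported background.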

The researchers used genetic data from previous studies to perform and confirm their research, including the new HapMap database, which is being developed by an international partnership of scientists and others to uncover and map variations in the human genome.

This recent work "was an exciting opportunity to form an interdisciplinary team of computer scientists, mathematicians and human geneticists," said Petros Drineas, the senior author of the study and assistant professor of computer science at Rensselaer Polytechnic Institute.

"Now that we have found that the program works well, we hope to implement it on a much larger scale, using hundreds of thousands of SNPs and thousands of individuals," said Drineas, who was funded by a National Science Foundation (NSF) Faculty Early Career Development Program (CAREER) award. The team's program "will be a valuable tool for understanding our genetic ancestry and targeting drugs and other medical treatments because it might be possible that these can affect people of different ancestry in very different ways."

Understanding our unique genetic makeup is a crucial step to unraveling the genetic basis for complex diseases. Although the human genome is 99 percent the same from human to human, it is that one percent that can have a major impact on our response to diseases, viruses, medications and toxins. If researchers can uncover the minute genetic details that set each of us apart, biomedical research and treatments can be better customized for each individual, Drineas said.

This program will help people understand their unique backgrounds, and aid historians and anthropologists in their studies of where different populations originated and how humans became such a hugely diverse, global society.

The program was more than 99 percent accurate in trials and correctly identified the ancestry of hundreds of individuals. These included people from genetically similar populations (such as Chinese and Japanese) and from genetically complex populations, such as Puerto Ricans, who can have a mix of Native American, European and African ancestry.

"When we compared our findings to the existing datasets, only one individual was incorrectly identified and his background was almost equally close between Chinese and Japanese," Drineas said. He went on to explain that the results are preliminary, but extremely promising. The team is now working to test its program on a much larger data set.

In addition to Drineas, the algorithm was developed by scientists from California, Puerto Rico and Greece. The researchers involved include lead author Peristera Paschou from the Democritus University of Thrace in Greece; Elad Ziv, Esteban G. Burchard and Shweta Choudhry from the University of California, San Francisco; William Rodriguez-Cintron from the University of Puerto Rico School of Medicine in San Juan, Puerto Rico; and Michael W. Mahoney from Yahoo! Research in California.

-- Gabrielle DeMarco, Rensselaer Polytechnic Institute, DEMARG@rpi.edu

This Behind the Scenes article was provided to LiveScience in partnership with the National Science Foundation

Petros Drineas
Peristera Paschou
Elad Ziv
Esteban Burchard
Shweta Choudhry
William Rodriguez-Cintron
Michael Mahoney

Related Institutions/Organizations
Rensselaer Polytechnic Institute
Democritus University of Thrace
University of California, San Francisco
University of Puerto Rico School of Medicine
Yahoo! Research

New York
Puerto Rico

Related Programs
Faculty Early Career Development (CAREER) Program

Related Awards
#0545538 CAREER: A Framework for Mining Multimode, Non-Homogeneous Tensor Data Sets With Linear and Non-Linear Degrees of Freedom

Related Websites
LiveScience.com: Computer Program Reveals Anyone's Ancestry: http://www.livescience.com/health/080404-bts-drineas.html
Genetic Ancestry Tests Mostly Hype, Scientists Say: http://www.livescience.com/health/071018-vanity-tests.html
Genes: The Instruction Manuals for Life: http://www.livescience.com/health/060529_mm_genes.html

Illustration of DNA strand.
The human genome is 99 percent the same from human to human.

Photo of mother and child.
DNA from people with common ancestors has common characteristics that can be identified.

Photo of mother holding her baby.
DNA is passed along from parents to their children.