Award Abstract # 0910812
FutureGrid: An Experimental, High-Performance Grid Test-bed

NSF Org: OAC (Office of Advanced Cyberinfrastructure)
Recipient: TRUSTEES OF INDIANA UNIVERSITY
Initial Amendment Date: September 4, 2009
Latest Amendment Date: March 25, 2013
Award Number: 0910812
Award Instrument: Cooperative Agreement
Program Manager: Robert Chadduck
rchadduc@nsf.gov
(703)292-2247
OAC (Office of Advanced Cyberinfrastructure)
CSE (Directorate for Computer and Information Science and Engineering)
Start Date: October 1, 2009
End Date: September 30, 2014 (Estimated)
Total Intended Award Amount: $10,100,000.00
Total Awarded Amount to Date: $10,133,500.00
Funds Obligated to Date: FY 2009 = $4,025,000.00
FY 2010 = $6,093,500.00
FY 2012 = $15,000.00
History of Investigator:
  • Geoffrey Fox (Principal Investigator)
    vxj6mb@virginia.edu
  • Jose Fortes (Co-Principal Investigator)
  • Andrew Grimshaw (Co-Principal Investigator)
  • Katarzyna Keahey (Co-Principal Investigator)
  • Warren Smith (Co-Principal Investigator)
Recipient Sponsored Research Office: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
(317)278-3473
Sponsor Congressional District: 09
Primary Place of Performance: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): YH86RTW2YVJ4
Parent UEI:
NSF Program(s): CYBERINFRASTRUCTURE, Innovative HPC
Primary Program Source: 01000910DB NSF RESEARCH & RELATED ACTIVITIES
01001011DB NSF RESEARCH & RELATED ACTIVITIES
01001112DB NSF RESEARCH & RELATED ACTIVITIES
01001213DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7476, 7619, 9251, 9215, HPCC
Program Element Code(s): 723100, 761900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project provides a capability that enables researchers to tackle complex research challenges in computer science related to the use and security of grids and clouds. These include topics ranging from authentication, authorization, scheduling, virtualization, middleware design, interface design and cybersecurity to the optimization of grid-enabled and cloud-enabled computational schemes for researchers in astronomy, chemistry, biology, engineering, atmospheric science and epidemiology. The project team will provide a significant new experimental computing grid and cloud test-bed, named FutureGrid, to the research community, together with user support for third-party researchers conducting experiments on FutureGrid.

The test-bed will make it possible for researchers to conduct experiments by submitting an experiment "plan" that is then executed via a sophisticated workflow engine, preserving the provenance and state information necessary to allow reproducibility.
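
The award text does not specify the plan format, but the idea of a replayable plan plus a provenance record can be illustrated with a short sketch. Everything below (the field names, the step names, the execute() helper) is a hypothetical illustration, not FutureGrid's actual schema or workflow engine:

```python
# Hypothetical sketch of a FutureGrid-style experiment "plan" and of the
# provenance record its execution might leave behind. The schema, step
# names, and execute() helper are illustrative assumptions only.
import hashlib
import json
import time

plan = {
    "name": "mapreduce-overhead-study",
    "resources": [
        {"site": "IU", "type": "bare-metal", "nodes": 4},
        {"site": "Chicago", "type": "vm", "image": "hadoop-0.20", "nodes": 4},
    ],
    "steps": [
        {"run": "deploy_image"},
        {"run": "execute_benchmark", "args": {"workload": "terasort"}},
        {"run": "collect_metrics"},
    ],
}

def execute(plan):
    """Walk the plan's steps, recording enough state to replay the run."""
    record = {
        # hash of the exact plan that ran, so a later rerun can be verified
        "plan_sha256": hashlib.sha256(
            json.dumps(plan, sort_keys=True).encode()).hexdigest(),
        "events": [],
    }
    for step in plan["steps"]:
        started = time.time()
        # ... dispatch step["run"] to the workflow engine here ...
        record["events"].append(
            {"step": step["run"], "started": started, "finished": time.time()})
    return record  # provenance record to archive alongside the results

print(execute(plan)["plan_sha256"][:12])
```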

The test-bed includes a geographically distributed set of heterogeneous computing systems, a data management system that will hold both metadata and a growing library of software images, and a dedicated network allowing isolatable, secure experiments. The test-bed will support virtual machine-based environments, as well as native operating systems for experiments aimed at minimizing overhead and maximizing performance. The project partners will integrate existing open-source software packages to create an easy-to-use software environment that supports the instantiation, execution and recording of grid and cloud computing experiments.

One goal of the project is to understand the behavior and utility of cloud computing approaches. Researchers will be able to measure the overhead of cloud technology by requesting linked experiments on both virtual and bare-metal systems. FutureGrid will enable US scientists to develop and test new approaches to parallel, grid and cloud computing, and to compare and collaborate with international efforts in this area. The FutureGrid project will provide an experimental platform that accommodates batch, grid and cloud computing, allowing researchers to attack a range of research questions associated with optimizing, integrating and scheduling the different service models. FutureGrid also provides a test-bed for middleware development and, because of its private network, allows middleware researchers to run controlled experiments under different network conditions and to test approaches to middleware that include direct interaction with the network control layer. Another component of the project is the development of benchmarks appropriate for grid computing, including workflow-based benchmarks derived from applications in astronomy, bioinformatics, seismology and physics.
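
As a hedged illustration of the linked virtual/bare-metal measurement described above, the following sketch compares paired benchmark timings to estimate virtualization overhead; the run times are made-up placeholders, not FutureGrid results:

```python
# A minimal sketch of the paired-measurement idea: run the same workload
# natively and inside a VM, then report the relative slowdown. The timings
# below are placeholder values, not FutureGrid data.
from statistics import mean

bare_metal_runs = [412.0, 409.5, 415.2]   # wall-clock seconds, native runs
virtualized_runs = [447.1, 451.8, 444.9]  # same workload inside a VM

def overhead(native, virtual):
    """Relative slowdown of the virtualized runs over the native runs."""
    return mean(virtual) / mean(native) - 1.0

print(f"virtualization overhead: {overhead(bare_metal_runs, virtualized_runs):.1%}")
```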

FutureGrid will form part of NSF's TeraGrid high-performance cyberinfrastructure. It will increase the capability of the TeraGrid to support innovative computer science research requiring access to lower levels of the grid software stack, the networking software stack, and virtualization and workflow orchestration tools. Full integration into the TeraGrid is anticipated by October 1, 2011.

Education and broader outreach activities include the dissemination of curricular materials on the use of FutureGrid, pre-packaged FutureGrid virtual machines configured for particular course modules, and educational modules based on virtual appliance networks and social networking technologies that focus on education in networking, parallel computing, virtualization and distributed computing. The project will advance education and training in distributed computing at academic institutions with less diverse computational resources. It will do this through instructional resources that include preconfigured environments giving students sandboxed virtual clusters, which can be used for teaching courses in parallel, cloud and grid computing. Such resources will also give academic institutions a simple opportunity to experiment with cloud technology and see whether it can enhance their campus resources. The FutureGrid project leverages the fruits of several software development projects funded by the National Science Foundation and the Department of Energy.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Larsson, P., and P. M. Kasson "Lipid Tail Protrusion in Simulations Predicts Fusogenic Activity of Influenza Fusion Peptide Mutants and Conformational Models" PLoS Comput Biol, v.9, 2013. 10.1371/journal.pcbi.1002950
Bosin, A. "A SOA-based model for the integrated provisioning of cloud and grid resources" Advances in Software Engineering, v.2013, 2013. 1687-8655
Bosin, A. "Resource Provisioning for e-Science Environments" International Journal of Grid and High Performance Computing, v.5, 2013, p.1. 1938-0267
Fox, G. "Robust Scalable Visualized Clustering in Vector and non Vector Semimetric Spaces" Parallel Processing Letters, 2013
Kwan, J. C., M. S. Donia, A. W. Han, E. Hirose, M. G. Haygood, and E. W. Schmidt "Genome streamlining and chemical defense in a coral reef symbiosis" Proceedings of the National Academy of Sciences, 2013. 1091-6490
Lin, Z., M. M. Zachariah, L. Marett, R. W. Hughen, R. W. Teichert, G. P. Concepcion, M. G. Haygood, B. M. Olivera, A. R. Light, and E. W. Schmidt "Griseorhodins D–F, Neuroactive Intermediates and End Products of Post-PKS Tailoring Modification in Griseorhodin Biosynthesis" Journal of Natural Products, v.77, 2014, p.1224
Marshall, P., K. Keahey, and T. Freeman "Elastic Site: Using Clouds to Elastically Extend Site Resources" IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), Melbourne, Australia, May 2010
Nelson, J. "Analyzing PAPI Performance on Virtual Machines" VMware Technical Journal, v.Winter, 2013
S, S., and A. Basu "Performance of Eucalyptus and OpenStack Clouds on FutureGrid" International Journal of Computer Applications, v.80, 2013
Thilina Gunarathne, Tak-Lon Wu, Judy Qiu, and Geoffrey Fox "Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications" Proceedings of Emerging Computational Methods for the Life Sciences Workshop of ACM, 2010

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

FutureGrid Summary

FutureGrid was a national-scale grid, cloud and HPC test-bed service of modest size that included computational resources at five distributed locations. Its experience and architecture were built around software-defined systems at all levels of the stack shown in Figure 1, encompassing virtualized and bare-metal infrastructure, networks, and application, systems and platform software, with the unifying goal of providing Computing Testbeds as a Service (CTaaS).

FutureGrid systems totaled 4,704 cores divided among distributed general-purpose clusters at Chicago, Florida, IU and Texas; a Cray XT5m at IU; and four small specialized clusters supporting SSDs (at SDSC), large disk/large memory (at IU) and general-purpose GPUs (at IU). FutureGrid's software system, Cloudmesh, aggregated resources not only from FutureGrid but also from OpenCirrus, Amazon, Microsoft Azure, HP Cloud and GENI. Cloudmesh was originally developed to simplify the execution of multiple concurrent experiments on a federated cloud infrastructure, and in addition to virtual resources, FutureGrid exposed bare-metal provisioning to users. Two important lessons from FutureGrid were the value of DevOps tools like Cloudmesh and of close integration between the software and systems administration staff. During its operation, FutureGrid supported 417 projects with 2,601 registered users.
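
The report does not document Cloudmesh's interface, but the aggregation idea it describes, one front end dispatching provisioning requests to FutureGrid bare metal, Amazon, Azure and other backends, can be sketched as follows. All class and method names here are hypothetical; this is not Cloudmesh's actual API:

```python
# Illustrative sketch of the aggregation idea behind Cloudmesh: a single
# front end routing provisioning requests to pluggable backends. These
# class and method names are hypothetical, not Cloudmesh's actual API.
from abc import ABC, abstractmethod

class Backend(ABC):
    @abstractmethod
    def boot(self, image: str, count: int) -> list:
        """Start `count` instances of `image`; return their identifiers."""

class BareMetalBackend(Backend):
    def boot(self, image, count):
        # stand-in for a real bare-metal provisioner (PXE boot, image copy, ...)
        return [f"fg-node-{i}" for i in range(count)]

class EC2Backend(Backend):
    def boot(self, image, count):
        # stand-in for a call to a commercial cloud API
        return [f"i-{i:08x}" for i in range(count)]

class Federation:
    """One interface over many providers, so an experiment names a provider
    per request instead of coding against each cloud API separately."""
    def __init__(self):
        self.backends = {}
    def register(self, name, backend):
        self.backends[name] = backend
    def boot(self, provider, image, count):
        return self.backends[provider].boot(image, count)

fed = Federation()
fed.register("futuregrid", BareMetalBackend())
fed.register("aws", EC2Backend())
print(fed.boot("futuregrid", "ubuntu-12.04", 2))  # ['fg-node-0', 'fg-node-1']
```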

The six figures show the Computing Testbed as a Service architecture; FutureGrid's distributed network superimposed on a map of the USA; the cloud, infrastructure, grid, HPC and test-bed services offered by FutureGrid; a mosaic of FutureGrid machines; four snapshots of FutureGrid monitoring capabilities; and a word cloud constructed from FutureGrid project titles.

We found it possible to classify the FutureGrid projects into four major areas: Computer Science and Middleware (56%); Domain Science (21%); Training, Education and Outreach (14%); and Computer Systems Evaluation (9%). The percentages of total projects in parentheses illustrate the importance of computer science projects in FutureGrid's portfolio. Looking at 200 FutureGrid projects in a two-year window (10/11 to 11/13), there were 136 research projects (the others were in education, technology evaluation and interoperability), of which 109 had a major CS component and 44 an application component, with 17 classified as both.

Of the 200 projects, 98 needed only access to virtual machines (VMs) and 54 requested both VMs and physical nodes. Of the 48 projects not requesting VMs, 8 studied cloud technology such as Hadoop, so in total 160 projects (80%) were cloud related. 16 projects involved GPU access, and 30% of all projects used MapReduce.
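
The quoted counts are internally consistent, as a quick check confirms (every figure below is taken directly from the paragraph above):

```python
# Consistency check of the project counts quoted above.
vm_only, vm_and_bare_metal, no_vm = 98, 54, 48
assert vm_only + vm_and_bare_metal + no_vm == 200    # all 200 projects accounted for
cloud_related = vm_only + vm_and_bare_metal + 8      # 8 no-VM projects studied cloud tech
assert cloud_related == 160 and cloud_related / 200 == 0.80  # "160 projects (80%)"
```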

By pooling our experience with the topics studied in these FutureGrid projects, we identified an initial list of 25 broad cloud computing research areas needing FutureGrid-type capabilities (project counts in parentheses): Core Virtualization (17); Networking (3); Wireless (0); P2P (2); Cyber-Physical Systems (CPS) and Mobile Systems (5); Real-Time (0); Storage (2); Distributed Clouds & Systems (8); Resource Management (9); Security and Privacy (10); Fault-Tolerance (5); Cyberinfrastructure (11); Programming Models (12); Libraries (5); Data Systems (10); Streaming Data (2); Artificial Intelligence (7); Network/Web Science (3); Software Engineering (2); Education (42, 90% of which in computer science); Open Source Software Testbed (0); Interoperability (3); Energy & Sustainability (0)...
