
NSF Org: IIS Division of Information & Intelligent Systems
Initial Amendment Date: May 30, 2012
Latest Amendment Date: May 30, 2012
Award Number: 1241838
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler, sspengle@nsf.gov, (703) 292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2012
End Date: May 31, 2013 (Estimated)
Total Intended Award Amount: $15,000.00
Total Awarded Amount to Date: $15,000.00
Recipient Sponsored Research Office: 9500 Gilman Dr, La Jolla, CA 92093-0021, US, (858) 534-4896
Primary Place of Performance: 9500 Gilman Drive, La Jolla, CA 92093-0505, US
NSF Program(s): Info Integration & Informatics
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
With the exponential increase in the size, complexity, and rate of acquisition of diverse types of data, there is an urgent need for new techniques for managing and analyzing such data. In this context, there is a critical need for benchmarks to facilitate evaluation of alternative solutions and provide for comparisons among different solution approaches targeted to big data applications. Benchmarks need to capture a variety of characteristics of big data storage, management, and analytics, including new feature sets, enormous data size, large-scale and evolving system configurations, shifting loads, and the heterogeneous technologies of big-data and cloud platforms. Existing benchmarks are inadequate for assessing emerging big data platforms, systems, and software such as SQL, NoSQL, and the Hadoop software ecosystem; different modalities or genres of big data, including graphs, streams, scientific data, document collections, and transaction data; new hardware options, including HDD vs. SSD, different types of HDD, SSD, and main memory, and large-memory systems; and new platform options, including dedicated commodity clusters and cloud platforms.
The Workshop on Big Data Benchmarking 2012 represents an important step toward the development of a suite of benchmarks providing objective measures of the effectiveness of hardware and software systems for big data applications. The objective of this invitation-only workshop is to identify key issues and to launch an activity around the definition of reference benchmarks that capture the essence of big data application scenarios. The effort aims to arrive at a set of objective measures and benchmark datasets for characterizing and comparing the performance, and the price/performance tradeoffs, of alternative solutions to big data storage, retrieval, processing, and analysis problems. The workshop brings together about 40 experts from academia and industry with backgrounds in big data, database systems, benchmarking and system performance, cloud storage and computing, and related areas. The industries represented span hardware, software, analytics, and applications. The group will develop a draft report describing a big data benchmark suite, to be widely disseminated on the web and through presentations and outreach activities at relevant conferences and workshops.
Broader Impacts: The availability of the big data benchmark suite will facilitate research and technological advances by providing objective measures for comparing alternative solutions to key big data problems.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The First Workshop on Big Data Benchmarking (WBDB2012), held on May 8-9, 2012 in San Jose, CA, served as an incubator for several promising approaches to defining a big data benchmark standard for industry. The meeting was attended by about 60 participants representing about 45 different organizations from industry and academia. Through an open forum for discussion of a number of issues related to big data benchmarking (including definitions of big data terms, benchmark processes, and auditing), attendees were able to extend their own views of big data benchmarking as well as communicate their own ideas, which ultimately led to the formation of small working groups to continue collaborative work in this area. Workshop attendees were selected based on their experience and expertise in the areas of management of big data, database systems, performance benchmarking, and big data applications. There was consensus among participants about both the need and the opportunity for defining benchmarks that capture the end-to-end aspects of big data applications. It was felt that big data benchmarks should follow the model adopted by industry's existing Transaction Processing Performance Council (TPC) benchmarks and include metrics not only for performance but also for price/performance, along with a sound foundation for fair comparison through audit mechanisms. Additionally, the benchmarks should consider several costs relevant to big data systems, including the total cost of acquisition, setup cost, and the total cost of ownership, including energy cost. The first WBDB workshop has been followed by a second workshop held in December 2012 in Pune, India, and a third workshop held in July 2013 in Xi’an, China.
Last Modified: 08/01/2013
Modified by: Chaitanya K Baru