Award Abstract # 1525953
III: Small: Indexing, Querying, and Visualizing Big Spatial and Spatio-temporal Data

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: REGENTS OF THE UNIVERSITY OF MINNESOTA
Initial Amendment Date: August 27, 2015
Latest Amendment Date: August 27, 2015
Award Number: 1525953
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2015
End Date: August 31, 2020 (Estimated)
Total Intended Award Amount: $499,768.00
Total Awarded Amount to Date: $499,768.00
Funds Obligated to Date: FY 2015 = $499,768.00
History of Investigator:
  • Mohamed Mokbel (Principal Investigator)
    mokbel@umn.edu
Recipient Sponsored Research Office: University of Minnesota-Twin Cities
2221 UNIVERSITY AVE SE STE 100
MINNEAPOLIS
MN  US  55414-3074
(612)624-5599
Sponsor Congressional District: 05
Primary Place of Performance: University of Minnesota-Twin Cities
200 Union ST SE
Minneapolis
MN  US  55455-2070
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): KABJZBBJ4B54
Parent UEI:
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7364, 7923
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project conducts research, develops requisite knowledge, and builds software infrastructure to support data management for Big Spatial and Spatio-temporal Data. It responds to the recent explosion in the amount of spatial and temporal data produced by devices such as smart phones, space telescopes, and medical devices. Applications that use such data and urgently need this research include studying climate data, which involves terabytes of monthly spatio-temporal satellite data; understanding the brain's architectural and functional principles by modeling brain neurons as spatial data; and analyzing billions of monthly geotagged social media posts for event detection and analysis. The project packages all its developed components into a full-fledged, free, open-source system, available to the research and developer communities at large. Besides its impact on industry, this project will have significant broader impact across multiple segments of society, including graduate and undergraduate student education, by using the project software as a vehicle for student research; outreach to K-12 students through simple map visualization APIs; curriculum development through test labs inside the software developed by this project; and tutorial presentations at domestic and international conferences.

While there is an urgent need to support big spatial data, meeting that need is hampered by the lack of specialized systems, techniques, and algorithms. Although big data is well supported by a variety of general-purpose distributed systems, none of these systems provides any special support for spatial or spatio-temporal data. The only way to support big spatial and spatio-temporal data in current systems is either to treat it as non-spatial data or to write code wrappers around existing non-spatial systems. However, doing so does not take advantage of the properties of spatial data, and hence results in sub-par performance. This project tackles this research gap by providing native support for spatial and spatio-temporal data inside current general-purpose big data systems. In particular, the project explores three main research topics, namely, indexing, querying, and visualization of big spatial and spatio-temporal data. In terms of indexing, the project builds novel, generic, and scalable spatial and spatio-temporal index structures for the Hadoop Distributed File System (HDFS), which is the de facto storage layer in most of today's big data systems. In terms of querying, the project develops novel query processing techniques for range queries, nearest-neighbor queries, and spatial joins that take advantage of the spatially indexed HDFS to support various query operations on big spatial and spatio-temporal data. In terms of visualization, the project develops new scalable techniques to visualize big spatial data as single- or multi-level images. Publications, technical reports, open-source software, and experimental data from this research are disseminated via the project web site (http://www.cs.umn.edu/~mokbel/BigSpatial).
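
To illustrate why spatially aware storage can help a range query, the following is a minimal single-machine sketch, not the project's actual index structures. It assumes a uniform grid as the partitioning scheme; the function names `build_grid_index` and `range_query` and the sample points are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_grid_index(points, cell_size):
    """Group (x, y) points by the uniform grid cell that contains them."""
    index = defaultdict(list)
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        index[cell].append((x, y))
    return index


def range_query(index, cell_size, x_min, y_min, x_max, y_max):
    """Return points inside the rectangle, visiting only overlapping cells."""
    results = []
    cx_min, cx_max = int(x_min // cell_size), int(x_max // cell_size)
    cy_min, cy_max = int(y_min // cell_size), int(y_max // cell_size)
    for cx in range(cx_min, cx_max + 1):
        for cy in range(cy_min, cy_max + 1):
            # Cells outside the query rectangle are never touched.
            for x, y in index.get((cx, cy), []):
                if x_min <= x <= x_max and y_min <= y <= y_max:
                    results.append((x, y))
    return results


# Hypothetical sample data: four points, grid cells of size 2.0.
points = [(1.0, 1.0), (2.5, 3.5), (9.0, 9.0), (4.2, 0.3)]
index = build_grid_index(points, cell_size=2.0)
hits = range_query(index, 2.0, 0.0, 0.0, 5.0, 4.0)
```

In this sketch the cell holding the point (9.0, 9.0) is never read, which is the kind of pruning a non-spatial layout cannot offer; a distributed system would apply the same idea at the level of file blocks rather than in-memory lists.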

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 98)
Abdeltawab Hendawi and Mohamed Ali and Mohamed F. Mokbel "Panda*: A Generic and Scalable Framework for Predictive Spatio-temporal Queries" GeoInformatica , 2017
Abdulaziz Almaslukh and Amr Magdy and Ahmed M. Aly and Mohamed F. Mokbel and Sameh Elnikety and Yuxiong He and Suman Nath and Walid G. Aref "Local Trend Discovery on Real-time Microblogs with Uncertain Locations in Tight Memory Environments" GeoInformatica , v.24 , 2020 , p.301-337
Ahmed Eldawy and Ibrahim Sabek and Mostafa Elganainy and Ammar Bakeer and Ahmed Abdelmotaleb and Mohamed Mokbel "Sphinx: Empowering Impala for Efficient Execution of SQL Queries on Big Spatial Data" Proceedings of the International Symposium on Advances in Spatial and Temporal Databases, SSTD , 2017 , p.Washingto
Ahmed Eldawy and Louai Alarabi and Mohamed F. Mokbel "Spatial Partitioning Techniques in SpatialHadoop" Proceedings of the International Conference on Very Large Data Bases, VLDB , 2015 , p.Kohala Co

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project has produced a multitude of open-source projects centered around spatial data. These include the SpatialHadoop project, the first full-fledged open-source MapReduce-based system geared towards indexing, querying, and visualizing Big Spatial Data. SpatialHadoop was downloaded more than 80,000 times within one year of its release, and has become the de facto system for dealing with Big Spatial Data. All source code, tutorials, and examples of SpatialHadoop, along with about a terabyte of freely available Big Spatial Data, are hosted by the infrastructure of this project at: http://spatialhadoop.cs.umn.edu/. The project has also enabled and hosted the development of HadoopViz, a MapReduce-based framework for visualizing big spatial data. HadoopViz exposes an extensible interface, which allows users to define new visualization types by defining a few abstract functions. HadoopViz is capable of generating big images with giga-pixel resolution by employing a three-phase technique: partition, plot, and merge.
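
The three-phase idea can be sketched on a single machine as follows. This is only an illustrative assumption about what each phase might do, not HadoopViz's actual interface: the image is a grid of per-pixel point counts, partial images are merged by summing pixels, and the function names `partition`, `plot`, and `merge` are hypothetical. The sketch also assumes every point lies within the given bounds.

```python
def partition(points, num_parts):
    """Split the input into chunks (a stand-in for distributed data blocks)."""
    return [points[i::num_parts] for i in range(num_parts)]


def plot(chunk, width, height, bounds):
    """Rasterize one chunk into a partial image of per-pixel point counts."""
    x_min, y_min, x_max, y_max = bounds
    image = [[0] * width for _ in range(height)]
    for x, y in chunk:
        # Clamp so points on the upper boundary fall in the last pixel.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        image[row][col] += 1
    return image


def merge(images):
    """Combine partial images of equal size by summing pixel values."""
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) for c in range(width)]
        for r in range(height)
    ]


# Hypothetical sample data rendered as a 2x2 image over [0, 10] x [0, 10].
points = [(0, 0), (1, 1), (9, 9), (10, 10), (5, 5)]
bounds = (0, 0, 10, 10)
partials = [plot(chunk, 2, 2, bounds) for chunk in partition(points, 3)]
final = merge(partials)
```

Because each chunk is plotted independently, the plot phase could run in parallel across machines, with only the partial images being exchanged in the merge phase.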


The project has also produced ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatio-temporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for two fundamental spatio-temporal queries, namely, spatio-temporal range and join queries. The extensibility of ST-Hadoop allows others to add features and operations easily using a similar approach. The key idea behind the performance gains in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System. All source code, tutorials, and examples of ST-Hadoop can be downloaded from: http://st-hadoop.cs.umn.edu/
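
A rough single-machine sketch of the indexing-layer idea is given below. It assumes, purely for illustration, that records are first sliced by fixed time intervals and then by a uniform spatial grid; this layout, the names `partition_key`, `build_partitions`, and `st_range_query`, and the sample records are hypothetical and are not taken from ST-Hadoop's code.

```python
from collections import defaultdict


def partition_key(record, slice_seconds, cell_size):
    """Map a (t, x, y) record to a (time slice, grid column, grid row) key."""
    t, x, y = record
    return (int(t // slice_seconds), int(x // cell_size), int(y // cell_size))


def build_partitions(records, slice_seconds, cell_size):
    """Group records so that each partition covers one time slice and cell."""
    parts = defaultdict(list)
    for rec in records:
        parts[partition_key(rec, slice_seconds, cell_size)].append(rec)
    return parts


def st_range_query(parts, slice_seconds, cell_size, t_range, x_range, y_range):
    """Return matching records and the number of partitions actually read."""
    (t0, t1), (x0, x1), (y0, y1) = t_range, x_range, y_range
    hits, scanned = [], 0
    for ts in range(int(t0 // slice_seconds), int(t1 // slice_seconds) + 1):
        for cx in range(int(x0 // cell_size), int(x1 // cell_size) + 1):
            for cy in range(int(y0 // cell_size), int(y1 // cell_size) + 1):
                bucket = parts.get((ts, cx, cy))
                if not bucket:
                    continue  # no data stored for this slice and cell
                scanned += 1
                hits.extend(
                    r for r in bucket
                    if t0 <= r[0] <= t1 and x0 <= r[1] <= x1 and y0 <= r[2] <= y1
                )
    return hits, scanned


# Hypothetical records as (timestamp in seconds, x, y).
records = [(10, 1.0, 1.0), (70, 1.5, 0.5), (3700, 1.0, 1.0), (20, 8.0, 8.0)]
parts = build_partitions(records, slice_seconds=3600, cell_size=2.0)
hits, scanned = st_range_query(parts, 3600, 2.0, (0, 100), (0.0, 2.0), (0.0, 2.0))
```

Here the query reads one of the three stored partitions, skipping both the later time slice and the distant grid cell, which is the effect a layout that is aware of both space and time is meant to achieve.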


Last Modified: 11/02/2020
Modified by: Mohamed F Mokbel

Please report errors in award information by writing to: awardsearch@nsf.gov.
