CloST: A hadoop-based storage system for big spatio-temporal data analytics

Haoyu Tan, Wuman Luo, Lionel M. Ni

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

54 Citations (Scopus)

Abstract

During the past decade, various GPS-equipped devices have generated a tremendous amount of data with time and location information, which we refer to as big spatio-temporal data. In this paper, we present the design and implementation of CloST, a scalable big spatio-temporal data storage system to support data analytics using Hadoop. The main objective of CloST is to avoid scanning the whole dataset when a spatio-temporal range is given. To this end, we propose a novel data model that gives special treatment to three core attributes: an object ID, a location, and a time. Based on this data model, CloST hierarchically partitions data using all core attributes, which enables efficient parallel processing of spatio-temporal range scans. Exploiting the characteristics of the data, we devise a compact storage structure that reduces the storage size by an order of magnitude. In addition, we propose scalable bulk loading algorithms capable of incrementally adding new data into the system. We conduct our experiments using a very large GPS log dataset, and the results show that CloST achieves fast data loading, good scalability in query processing, and a high data compression ratio.
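The idea of partitioning on the core attributes so that a range query touches only a few partitions can be illustrated with a minimal sketch. This is not CloST's actual layout or API: the fixed one-hour time buckets, the uniform 0.1-degree grid, the two-level nesting (time bucket, then grid cell), and every name below (`time_bucket`, `grid_cell`, `build_partitions`, `range_scan`) are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

BUCKET_SECONDS = 3600  # assumed width of a time bucket (hypothetical)
CELL_DEGREES = 0.1     # assumed side length of a grid cell (hypothetical)


def time_bucket(t):
    """Index of the time bucket containing timestamp t (seconds)."""
    return int(t // BUCKET_SECONDS)


def grid_cell(lon, lat):
    """Index of the grid cell containing the point (lon, lat)."""
    return (math.floor(lon / CELL_DEGREES), math.floor(lat / CELL_DEGREES))


def build_partitions(records):
    """Group (object_id, t, lon, lat) records by time bucket, then by cell."""
    partitions = defaultdict(lambda: defaultdict(list))
    for rec in records:
        _, t, lon, lat = rec
        partitions[time_bucket(t)][grid_cell(lon, lat)].append(rec)
    return partitions


def range_scan(partitions, t_min, t_max, lon_min, lon_max, lat_min, lat_max):
    """Return matching records and the number of partitions actually read."""
    x_lo, y_lo = grid_cell(lon_min, lat_min)
    x_hi, y_hi = grid_cell(lon_max, lat_max)
    hits, scanned = [], 0
    for bucket in range(time_bucket(t_min), time_bucket(t_max) + 1):
        cells = partitions.get(bucket)
        if cells is None:
            continue  # no data in this time bucket
        for (x, y), recs in cells.items():
            if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi):
                continue  # partition pruned without reading its records
            scanned += 1
            # Partitions on the boundary may hold records outside the range,
            # so each record is still checked against the exact bounds.
            hits.extend(
                r for r in recs
                if t_min <= r[1] <= t_max
                and lon_min <= r[2] <= lon_max
                and lat_min <= r[3] <= lat_max
            )
    return hits, scanned


records = [
    ("taxi1", 100, 114.15, 22.28),
    ("taxi2", 4000, 114.15, 22.28),
    ("taxi1", 200, 121.47, 31.23),
]
partitions = build_partitions(records)
hits, scanned = range_scan(partitions, 0, 3599, 114.0, 114.5, 22.0, 22.5)
```

In this toy run the three records land in three separate partitions, and the query reads only the one whose time bucket and grid cell overlap the requested range. A real system would additionally have to choose partition boundaries that adapt to data skew, which this sketch does not attempt.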

Original language: English
Title of host publication: CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
Pages: 2139-2143
Number of pages: 5
Publication status: Published - 2012
Externally published: Yes
Event: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012 - Maui, HI, United States
Duration: 29 Oct 2012 to 2 Nov 2012

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Country/Territory: United States
City: Maui, HI
Period: 29/10/12 to 2/11/12

Keywords

  • big data
  • spatio-temporal data
  • storage system
