TY - GEN
T1 - Exploring the use of diverse replicas for big location tracking data
AU - Ding, Ye
AU - Tan, Haoyu
AU - Luo, Wuman
AU - Ni, Lionel M.
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/8/29
Y1 - 2014/8/29
N2 - The value of large amounts of location tracking data has received wide attention in many applications, including human behavior analysis, urban transportation planning, and various location-based services (LBS). Nowadays, both scientific and industrial communities are encouraged to collect as much location tracking data as possible, which brings about two issues: 1) it is challenging to process queries on big location tracking data efficiently, and 2) it is expensive to store several exact data replicas for fault tolerance. So far, several dedicated storage systems have been proposed to address these issues. However, they do not work well when the query ranges vary widely. In this paper, we present the design of a storage system using a diverse replica scheme, which improves query processing efficiency while reducing the cost of storage space. To the best of our knowledge, we are the first to investigate data storage and processing in the context of big location tracking data. Specifically, we conduct an in-depth theoretical and empirical analysis of the trade-offs between different spatio-temporal partitioning schemes as well as data encoding schemes. Then we propose an effective approach to select an appropriate set of diverse replicas, which is optimized for the expected query loads while conforming to the given storage space budget. The experimental results confirm that using diverse replicas can significantly improve the overall query performance. The results also demonstrate that the proposed algorithms for the replica selection problem are both effective and efficient.
AB - The value of large amounts of location tracking data has received wide attention in many applications, including human behavior analysis, urban transportation planning, and various location-based services (LBS). Nowadays, both scientific and industrial communities are encouraged to collect as much location tracking data as possible, which brings about two issues: 1) it is challenging to process queries on big location tracking data efficiently, and 2) it is expensive to store several exact data replicas for fault tolerance. So far, several dedicated storage systems have been proposed to address these issues. However, they do not work well when the query ranges vary widely. In this paper, we present the design of a storage system using a diverse replica scheme, which improves query processing efficiency while reducing the cost of storage space. To the best of our knowledge, we are the first to investigate data storage and processing in the context of big location tracking data. Specifically, we conduct an in-depth theoretical and empirical analysis of the trade-offs between different spatio-temporal partitioning schemes as well as data encoding schemes. Then we propose an effective approach to select an appropriate set of diverse replicas, which is optimized for the expected query loads while conforming to the given storage space budget. The experimental results confirm that using diverse replicas can significantly improve the overall query performance. The results also demonstrate that the proposed algorithms for the replica selection problem are both effective and efficient.
UR - http://www.scopus.com/inward/record.url?scp=84907755802&partnerID=8YFLogxK
U2 - 10.1109/ICDCS.2014.17
DO - 10.1109/ICDCS.2014.17
M3 - Conference contribution
AN - SCOPUS:84907755802
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 83
EP - 92
BT - Proceedings - International Conference on Distributed Computing Systems
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE 34th International Conference on Distributed Computing Systems, ICDCS 2014
Y2 - 30 June 2014 through 3 July 2014
ER -