Title

GHive: accelerating analytical query processing in Apache Hive via CPU-GPU heterogeneous computing

Author
Corresponding Author
Bo Tang
DOI
Publication Date
2022-11-07
Conference Name
Proceedings of the 13th Symposium on Cloud Computing
ISBN
9781450394147
Conference Date
November 7 - 11, 2022
Conference Place
San Francisco
Abstract

As a popular distributed data warehouse system, Apache Hive has been widely used for big data analytics in many organizations. Meanwhile, exploiting the massive parallelism of GPUs to accelerate online analytical processing (OLAP) has been extensively explored in the database community. In this paper, we present GHive, which enhances CPU-based Hive via CPU-GPU heterogeneous computing. GHive is designed for business intelligence applications and provides the same API as Hive for compatibility. To run SQL queries jointly on both CPU and GPU, GHive comes with three key techniques: (i) a novel data model, gTable, which is column-based and enables efficient data movement between CPU memory and GPU memory; (ii) a GPU-based operator library, Panda, which provides a complete set of SQL operators with extensively optimized GPU implementations; (iii) a hardware-aware MapReduce job placement scheme, which judiciously places jobs on either the GPU or the CPU via a cost-based approach. In the experiments, we observe that GHive outperforms Hive in both query processing speed and operating expense on the Star Schema Benchmark (SSB).

SUSTech Authorship
First; Corresponding; Others
Data Source
Manual submission
Citation statistics
Times Cited [WOS]: 0
Document Type
Conference paper
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/415608
Department
Department of Computer Science and Engineering
Affiliation
1. Research Inst. of Trustworthy Autonomous Systems, Southern University of Science and Technology; Department of Computer Science and Engineering, Southern University of Science and Technology
2. Aalborg University
3. The Hong Kong Polytechnic University
4. Boston University
5. Huawei Technologies Co., Ltd
First Author Affiliation
Department of Computer Science and Engineering
Corresponding Author Affiliation
Department of Computer Science and Engineering
First Author's First Affiliation
Department of Computer Science and Engineering
Recommended Citation
GB/T 7714
Haotian Liu, Bo Tang, Jiashu Zhang, et al. GHive: accelerating analytical query processing in Apache Hive via CPU-GPU heterogeneous computing[C], 2022.
Files in This Item:
File Name/Size DocType Version Access License
ghive.pdf(3081KB) Restricted Access--

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.