Title | DHive: Query Execution Performance Analysis via Dataflow in Apache Hive |
Author | Zhang, Chaozu; Shen, Qiaomu; Tang, Bo |
Corresponding Author | Tang, Bo |
Publication Years | 2023-08-01
|
DOI | |
Source Title | PROCEEDINGS OF THE VLDB ENDOWMENT |
ISSN | 2150-8097
|
Volume | 16 |
Issue | 12 |
Abstract | Apache Hive is widely used for large-scale data analysis in many organizations. Various visual analytical tools have been developed to help Hive users quickly analyze the query execution process and identify the performance bottlenecks of executed queries. However, existing tools mostly focus on showing the time usage of query sub-components (jobs and operators) and fail to provide enough evidence to analyze the root causes of slow execution progress. To tackle this problem, we develop DHive, a visual analytical system that visualizes and analyzes query execution progress via dataflow analysis. DHive shows the dataflow during query execution at multiple levels (query level, job level and task level), which enables users to identify the key jobs/tasks and explain their time usage by linking them to auxiliary information such as the system configuration and hardware status. We demonstrate the effectiveness of DHive through two cases in a production cluster. DHive is open-source at https://github.com/DBGroupSUSTech/DHive.git. |
URL | [Source Record] |
Indexed By | |
Language | English
|
SUSTech Authorship | First; Corresponding |
Funding Project | Shenzhen Fundamental Research Program[20220815112848002]; Guangdong Provincial Key Laboratory[2020B121201001] |
WOS Research Area | Computer Science
|
WOS Subject | Computer Science, Information Systems; Computer Science, Theory & Methods |
WOS Accession No | WOS:001067701000066
|
Publisher | |
Data Source | Web of Science
|
Citation statistics | |
Document Type | Journal Article |
Identifier | http://kc.sustech.edu.cn/handle/2SGJ60CL/582919 |
Department | Department of Computer Science and Engineering |
Affiliation | 1.Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen, Peoples R China 2.Southern Univ Sci & Technol, Res Inst Trustworthy Autonomous Syst, Shenzhen, Peoples R China |
First Author Affiliation | Department of Computer Science and Engineering |
Corresponding Author Affiliation | Department of Computer Science and Engineering |
First Author's First Affiliation | Department of Computer Science and Engineering |
Recommended Citation GB/T 7714 |
Zhang, Chaozu, Shen, Qiaomu, Tang, Bo. DHive: Query Execution Performance Analysis via Dataflow in Apache Hive[J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16(12).
|
APA |
Zhang, Chaozu, Shen, Qiaomu, & Tang, Bo. (2023). DHive: Query Execution Performance Analysis via Dataflow in Apache Hive. PROCEEDINGS OF THE VLDB ENDOWMENT, 16(12).
|
MLA |
Zhang, Chaozu, et al. "DHive: Query Execution Performance Analysis via Dataflow in Apache Hive". PROCEEDINGS OF THE VLDB ENDOWMENT 16.12 (2023).
|
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.