Title

DSP: Efficient GNN Training with Multiple GPUs

Author
Corresponding Author: Yan, Xiao
DOI
Publication Years
2023-02-25
Conference Name
28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023
ISBN
9798400700156
Source Title
Pages
392-404
Conference Date
February 25, 2023 - March 1, 2023
Conference Place
Montreal, QC, Canada
Author of Source
ACM SIGHPC; ACM SIGPLAN; HUAWEI
Publisher
Abstract
Jointly utilizing multiple GPUs to train graph neural networks (GNNs) is crucial for handling large graphs and achieving high efficiency. However, we find that existing systems suffer from high communication costs and low GPU utilization due to improper data layout and training procedures. Thus, we propose a system dubbed Distributed Sampling and Pipelining (DSP) for multi-GPU GNN training. DSP adopts a tailored data layout that stores the graph topology and popular node features in GPU memory, so that they can be accessed over the fast NVLink connections among the GPUs. For efficient graph sampling with multiple GPUs, we introduce a collective sampling primitive (CSP), which pushes the sampling tasks to the data to reduce communication. We also design a producer-consumer-based pipeline, which allows tasks from different mini-batches to run concurrently to improve GPU utilization. We compare DSP with state-of-the-art GNN training frameworks, and the results show that DSP consistently outperforms the baselines under different datasets, GNN models and GPU counts. The speedup of DSP can be up to 26x and is over 2x in most cases.
© 2023 ACM.
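The producer-consumer pipeline mentioned in the abstract can be illustrated with a minimal sketch. This is not DSP's implementation: it uses CPU threads and bounded queues from the Python standard library, and the stage functions (`sample`, `load`, `train`), the `run_pipeline` helper, and the `capacity` parameter are hypothetical stand-ins chosen for illustration. It only shows the general idea that each stage consumes from the previous stage and produces for the next, so that work on different mini-batches can overlap.

```python
import queue
import threading

SENTINEL = None  # marks the end of the mini-batch stream


def stage(work, inbox, outbox):
    """Consume items from inbox, apply work, and produce results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


def run_pipeline(batch_ids, capacity=2):
    """Run sample -> load -> train as three overlapping stages."""
    # Hypothetical stand-ins for the real per-stage GPU work.
    def sample(batch_id):
        return {"batch": batch_id, "nodes": [batch_id, batch_id + 1]}

    def load(sampled):
        return {**sampled, "features": [float(n) for n in sampled["nodes"]]}

    def train(loaded):
        return (loaded["batch"], sum(loaded["features"]))

    # Bounded queues apply back-pressure so no stage runs far ahead.
    queues = [queue.Queue(maxsize=capacity) for _ in range(4)]

    def feed():
        for batch_id in batch_ids:
            queues[0].put(batch_id)
        queues[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    for i, work in enumerate([sample, load, train]):
        threads.append(
            threading.Thread(target=stage, args=(work, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # Drain results on the calling thread until the sentinel arrives.
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(4)))
```

While the `train` stage works on one mini-batch, the `sample` and `load` stages can already be processing later ones. Because each stage is a single thread reading a FIFO queue, results come out in mini-batch order.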
SUSTech Authorship
Corresponding
Language
English
Indexed By
EI Accession Number
20231013675700
EI Keywords
Deep learning ; Digital signal processing ; Graph neural networks ; Program processors ; Topology
ESI Classification Code
Ergonomics and Human Factors Engineering:461.4 ; Semiconductor Devices and Integrated Circuits:714.2 ; Computer Circuits:721.3 ; Artificial Intelligence:723.4 ; Combinatorial Mathematics, Includes Graph Theory, Set Theory:921.4
Data Source
EV Compendex
Citation statistics
Cited Times [WOS]:0
Document Type: Conference paper
Identifier: http://kc.sustech.edu.cn/handle/2SGJ60CL/519763
Department: Department of Computer Science and Engineering
Affiliation
1.Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
2.Department of Computer Science and Engineering, Southern University of Science and Technology, China
3.Amazon Web Services
Corresponding Author Affiliation: Department of Computer Science and Engineering
Recommended Citation
GB/T 7714
Cai, Zhenkun,Zhou, Qihui,Yan, Xiao,et al. DSP: Efficient GNN Training with Multiple GPUs[C]//ACM SIGHPC; ACM SIGPLAN; HUAWEI:Association for Computing Machinery,2023:392-404.