Title

Variable Selection for Distributed Sparse Regression Under Memory Constraints

Author
Wang, Haofeng; Jiang, Xuejun; Zhou, Min; Jiang, Jiancheng
Corresponding Author
Jiang, Xuejun
Publication Date
2023-02-01
Source Title
Communications in Mathematics and Statistics
ISSN
2194-6701
EISSN
2194-671X
Abstract
This paper studies variable selection by penalized likelihood for distributed sparse regression when the sample size n is large and memory is limited, a pressing problem in the big data era. A naive divide-and-conquer approach splits the data into N parts, fits each part on one of N machines, averages the results from all machines, and then reads off the selected variables. However, this approach tends to select more noise variables, and the false discovery rate may not be well controlled. We improve it with a specially designed weighted average in the aggregation step. Although the alternating direction method of multipliers (ADMM) has been used in the literature to handle massive data, our proposed method substantially reduces the computational burden and achieves lower mean squared error in most cases. Theoretically, we establish asymptotic properties of the resulting estimators for likelihood models with a diverging number of parameters. Under some regularity conditions, we establish oracle properties in the sense that our distributed estimator has the same asymptotic efficiency as the estimator based on the full sample. Computationally, we propose a distributed penalized likelihood algorithm that refines the results for general likelihoods. The proposed method is evaluated by simulations and a real data example.
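The naive divide-and-conquer baseline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a linear model with an L1 (lasso) penalty solved by proximal gradient descent, and the function names, the tuning value `lam`, and the simulated data are all hypothetical. The paper's weighted aggregation is not specified in this record and is not reproduced here.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal map of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


def naive_divide_and_conquer(X, y, n_machines, lam):
    """Fit each data block separately, then average the local estimates."""
    blocks = np.array_split(np.arange(X.shape[0]), n_machines)
    local = np.array([lasso_ista(X[idx], y[idx], lam) for idx in blocks])
    beta_avg = local.mean(axis=0)
    # The plain average is nonzero whenever the local estimates do not cancel,
    # so a noise variable kept by a single machine can survive aggregation.
    selected = np.flatnonzero(beta_avg)
    return beta_avg, selected, local


# Hypothetical simulated data: 3 true signals among 20 predictors.
rng = np.random.default_rng(0)
n, p = 2000, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

beta_avg, selected, local = naive_divide_and_conquer(X, y, n_machines=4, lam=0.1)
print("selected after simple averaging:", selected)
print("selected by every machine:", np.flatnonzero((local != 0).all(axis=0)))
```

Comparing the two printed index sets shows how simple averaging can retain variables that only some machines selected, which is the over-selection the abstract attributes to the naive approach.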
Language
English
SUSTech Authorship
Corresponding
Funding Project
NSFC[11871263] ; NSF grant of Guangdong Province of China[2017A030313012]
WOS Research Area
Mathematics
WOS Subject
Mathematics
WOS Accession No
WOS:000921784400001
Data Source
Web of Science
Citation statistics
Cited Times [WOS]:0
Document Type
Journal Article
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/475022
Department
Department of Statistics and Data Science
Affiliation
1.Harbin Inst Technol, Dept Math, Harbin, Peoples R China
2.Southern Univ Sci & Technol, Dept Stat & Data Sci, Shenzhen, Peoples R China
3.Hong Kong Baptist Univ United Int Coll, Beijing Normal Univ, Zhuhai, Peoples R China
4.Univ North Carolina Charlotte, Dept Math & Stat, Charlotte, NC USA
First Author Affiliation
Department of Statistics and Data Science
Corresponding Author Affiliation
Department of Statistics and Data Science
Recommended Citation
GB/T 7714
Wang, Haofeng, Jiang, Xuejun, Zhou, Min, et al. Variable Selection for Distributed Sparse Regression Under Memory Constraints[J]. Communications in Mathematics and Statistics, 2023.
APA
Wang, Haofeng, Jiang, Xuejun, Zhou, Min, & Jiang, Jiancheng. (2023). Variable Selection for Distributed Sparse Regression Under Memory Constraints. Communications in Mathematics and Statistics.
MLA
Wang, Haofeng, et al. "Variable Selection for Distributed Sparse Regression Under Memory Constraints." Communications in Mathematics and Statistics (2023).

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.