Title

迁移学习和元学习辅助的深度学习算法研究及在材料数据上的应用

Alternative Title
RESEARCH ON DEEP LEARNING ALGORITHMS ASSISTED BY TRANSFER LEARNING AND META-LEARNING AND THEIR APPLICATION IN MATERIAL DATA
Author
Name pinyin
YUAN Zhongju
School number
12032679
Degree
Master
Discipline
0856 Materials and Chemical Engineering (材料与化工)
Subject category of dissertation
0856 Materials and Chemical Engineering (材料与化工)
Supervisor
王振坤 (WANG Zhenkun)
Mentor unit
School of System Design and Intelligent Manufacturing
Publication Years
2022-05-10
Submission date
2022-06-26
University
Southern University of Science and Technology (南方科技大学)
Place of Publication
Shenzhen
Abstract

This thesis addresses information extraction from text data that lack label information, such as data in materials science and biomedicine: extracting the required information, for example relations between materials and material properties, from the plain text of papers in the relevant field. The main current approach to information extraction with scarce labeled data is few-shot meta-learning, but models trained this way generalize poorly: when the test data come from a different domain than the training data and the domain gap is large, extraction accuracy can drop sharply. Existing transfer learning methods for cross-domain information extraction target such large domain gaps, but most of them assume that some similarity or partial overlap remains between the domains, and their effect is limited when data are scarce. Methods that apply few-shot meta-learning and transfer learning together to tackle the lack of data in specific domains have not yet been widely studied or applied, but this will be a future trend.

This thesis uses web crawling to retrieve papers from a paper repository by keyword. To read the retrieved papers quickly and obtain the key information they contain, it proposes a cross-domain few-shot learning algorithm for the relation extraction task in fields, such as materials science, where labeled data are scarce, and completes a survey of the existing literature.
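As an illustration, the keyword-selection step of such a crawler can be sketched as a simple filter over retrieved records. The function below is a hypothetical minimal version; the thesis's actual crawler, repository API, and query logic are not described in the abstract:

```python
def keyword_filter(records, keywords):
    """Keep papers whose title or abstract mentions any keyword,
    case-insensitively. `records` are (title, abstract) pairs."""
    kws = [k.lower() for k in keywords]
    return [(t, a) for t, a in records
            if any(k in (t + " " + a).lower() for k in kws)]
```

In a real pipeline this filter would sit after the HTTP retrieval step and before the relation extraction model.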

This thesis is the first to propose a cross-domain few-shot learning algorithm that combines representation learning with domain transfer to address the scarcity of labeled data in specific domains. The representation learning component optimizes the geometry of the latent space, pulling sentences that express the same relation closer together and pushing sentences of different relations farther apart, which improves relation extraction accuracy. In addition, domain transfer reduces the gap between domains, so that the optimized latent-space geometry also improves the algorithm's cross-domain performance.
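The latent-space idea can be illustrated with a prototypical-network-style sketch: average the support-sentence embeddings of each relation class into one prototype, then classify a query sentence by its nearest prototype. This is a minimal NumPy illustration over assumed toy embeddings, not the thesis's actual model, which would use a learned encoder (e.g., BERT) and a training loss that shapes the latent space:

```python
import numpy as np

def prototypes(support, labels):
    """One prototype per relation class: the mean of that class's
    support-sentence embeddings."""
    classes = sorted(set(labels))
    protos = np.stack([support[np.array(labels) == c].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify(queries, protos):
    """Assign each query embedding to the nearest prototype by
    squared Euclidean distance in the latent space."""
    d = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)
```

When same-relation sentences cluster tightly and different relations are well separated, as the representation learning objective encourages, this nearest-prototype rule becomes more accurate.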

To validate the proposed algorithm, this thesis compares it with existing algorithms on public datasets and on the materials dataset collected for this work; the results show an advantage for the proposed algorithm. Ablation studies further verify that both the representation learning and the domain transfer components contribute to the final performance gain.

In addition, to verify the robustness of the model, this thesis tests it with an imperceptible adversarial attack that does not change the meaning of the original text. To improve model stability and reduce its sensitivity to perturbations, adversarial training is then applied to reduce the influence of adversarial examples on the model.
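Adversarial training of this kind is commonly implemented by perturbing input embeddings in the direction that most increases the loss (an FGSM-style step) and training on the perturbed inputs as well. The sketch below shows the perturbation step for a simple logistic model; it is a toy illustration under assumed inputs, not the attack or defense actually used in the thesis:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w):
    """Binary cross-entropy of a logistic model p = sigmoid(w . x)."""
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fgsm_perturb(x, y, w, eps):
    """FGSM-style adversarial step: move the embedding x by eps along
    the sign of the loss gradient wrt x, the direction that most
    increases the loss to first order. Adversarial training would then
    also fit the model on such perturbed embeddings."""
    p = sigmoid(w @ x)
    grad_x = (p - y) * w  # d(cross-entropy)/dx for the logistic model
    return x + eps * np.sign(grad_x)
```

For a small `eps` the perturbed input has a higher loss than the clean one, which is exactly the sensitivity that adversarial training aims to suppress.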

Keywords
Language
Chinese
Training classes
Independent training (独立培养)
Enrollment Year
2020
Year of Degree Awarded
2022-06
Academic Degree Assessment Subcommittee
School of System Design and Intelligent Manufacturing
Domestic book classification number
TP399
Data Source
Manual submission (人工提交)
Document Type
Thesis
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/343079
Department
School of System Design and Intelligent Manufacturing
Recommended Citation
GB/T 7714
袁中菊. 迁移学习和元学习辅助的深度学习算法研究及在材料数据上的应用[D]. 深圳: 南方科技大学, 2022.
Files in This Item:
File Name/Size: 12032679-袁中菊-系统设计与智能 (1979KB)
Access: Restricted Access

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.