The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

Links

https://link.springer.com/10.1007/978-3-031-28241-6_4

DOI

https://doi.org/10.1007/978-3-031-28241-6_4
Final published version

Pavel Efimov
Leonid Boytsov
Elena Arslanova
Pavel Braslavski

Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. [8] proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast, we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementations to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages and showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words (across languages) showed that fine-tuning leads to “forgetting” some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning. Our software is publicly available at https://github.com/pefimov/cross-lingual-adjustment.
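
The distance analysis mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the choice of L2 distance, and the toy two-dimensional vectors are assumptions made for illustration only. The idea is to compare the average distance between embeddings of related word pairs with that of unrelated pairs; a larger gap would suggest better-preserved cross-lingual alignment.

```python
import math

def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean_pair_distance(pairs):
    """Average distance over a list of (source_vec, target_vec) pairs."""
    return sum(l2_distance(u, v) for u, v in pairs) / len(pairs)

def alignment_gap(related_pairs, unrelated_pairs):
    """Hypothetical summary score: unrelated-pair distance minus related-pair distance.

    A shrinking gap after fine-tuning would be consistent with the model
    "forgetting" part of its cross-lingual alignment.
    """
    return mean_pair_distance(unrelated_pairs) - mean_pair_distance(related_pairs)

# Toy vectors standing in for contextualized embeddings of word pairs.
related = [([0.0, 0.0], [3.0, 4.0])]
unrelated = [([0.0, 0.0], [6.0, 8.0])]
gap = alignment_gap(related, unrelated)
```

In practice the vectors would come from a multilingual encoder applied to word-aligned parallel sentences, computed once before and once after task fine-tuning so the two gaps can be compared.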

Original language	English
Title of host publication	Advances in Information Retrieval: 45th European Conference on Information Retrieval
Subtitle of host publication	book
Editors	Jaap Kamps, Lorraine Goeuriot
Publisher	Springer Cham
Chapter	Chapter 4
Pages	51-67
Number of pages	17
ISBN (Electronic)	978-3-031-28238-6
ISBN (Print)	978-3-031-28237-9
DOIs	https://doi.org/10.1007/978-3-031-28241-6_4
Publication status	Published - 16 Mar 2023

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13982
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

ASJC Scopus subject areas

General Computer Science
Theoretical Computer Science

WoS ResearchAreas Categories

Computer Science, Information Systems
Computer Science, Software Engineering
Computer Science, Theory & Methods

ID: 37140299