Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. [8] proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast, we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementation to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages and showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words (across languages) showed that fine-tuning leads to “forgetting” some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning. Our software is publicly available at https://github.com/pefimov/cross-lingual-adjustment.
Original language: English
Title of host publication: Advances in Information Retrieval: 45th European Conference on Information Retrieval
Subtitle of host publication: book
Editors: Jaap Kamps, Lorraine Goeuriot
Publisher: Springer Cham
Chapter: Chapter 4
Pages: 51-67
Number of pages: 17
ISBN (Electronic): 978-3-031-28238-6
ISBN (Print): 978-3-031-28237-9
Publication status: Published - 16 Mar 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13982
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

    ASJC Scopus subject areas

  • General Computer Science
  • Theoretical Computer Science

    WoS ResearchAreas Categories

  • Computer Science, Information Systems
  • Computer Science, Software Engineering
  • Computer Science, Theory & Methods

ID: 37140299