ABSTRACT
This paper investigates the ability of the multilingual BERT (mBERT) language model to transfer syntactic knowledge cross-lingually, verifying whether and to what extent syntactic dependency relationships learned in one language are maintained in other languages. In detail, the main contributions of this paper are: (i) an analysis of the cross-lingual syntactic transfer capability of the mBERT model; (ii) a detailed comparison of cross-lingual syntactic transfer among languages belonging to different branches of the Indo-European family, namely English, Italian and French, which present very different syntactic constructions; (iii) a study of the transferability of a syntactic phenomenon characteristic of Italian, namely pronoun dropping (pro-drop), also known as omissibility of the subject. To this end, a structural probe designed to reconstruct the dependency parse tree of a sentence has been used, with the input sentences represented by the contextual embeddings from the mBERT layers. The results of the experimental assessment show that the mBERT model transfers syntactic knowledge among these languages. Moreover, in the transition from pro-drop to non-pro-drop languages and vice versa, the probe has proven to be more effective for languages sharing a common linguistic matrix. The possibility of transferring syntactic knowledge, especially in the case of specific phenomena, both meets a theoretical need and can have important practical implications for syntactic tasks such as dependency parsing.
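As an illustration of the probing idea summarised above, the following minimal sketch shows one common way a structural probe can recover an undirected dependency tree from contextual embeddings. It assumes a distance-based formulation that the abstract itself does not spell out: a linear map B is applied to each token embedding, squared L2 distances between the projected vectors are read as tree distances, and the tree is taken to be the minimum spanning tree over those distances. The names `project`, `probe_distances`, `minimum_spanning_tree`, the matrix `B`, and the toy embeddings are all hypothetical; in practice B would be trained on a treebank and the embeddings would come from an mBERT layer.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def project(B: Matrix, h: Vector) -> Vector:
    """Apply the (hypothetical) probe matrix B to one token embedding h."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]


def probe_distances(B: Matrix, embeddings: List[Vector]) -> Matrix:
    """Squared L2 distance between every pair of projected token embeddings."""
    projected = [project(B, h) for h in embeddings]
    n = len(projected)
    return [
        [
            sum((a - b) ** 2 for a, b in zip(projected[i], projected[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def minimum_spanning_tree(dist: Matrix) -> List[Tuple[int, int]]:
    """Prim's algorithm; returns sorted undirected edges (i, j) with i < j."""
    n = len(dist)
    if n == 0:
        return []
    in_tree = {0}
    edges: List[Tuple[int, int]] = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree; ties are broken by index.
        _, i, j = min(
            (dist[i][j], i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        in_tree.add(j)
        edges.append((min(i, j), max(i, j)))
    return sorted(edges)


# Toy stand-ins for four token embeddings (assumption: real ones are mBERT
# hidden states with hundreds of dimensions, not 2-d points).
toy_embeddings = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
identity_B = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder probe

predicted_edges = minimum_spanning_tree(probe_distances(identity_B, toy_embeddings))
print(predicted_edges)
```

In a cross-lingual setting of the kind described here, such a probe would presumably be fitted on sentences from one language and then applied unchanged to embeddings of sentences from another, with the predicted edges compared against gold dependency annotations.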