
Author: Paola MERLO

Multilingual computational models for the study of language

Abstract: Current computational linguistic work shows great interest in extending successful probabilistic modelling to multilingual approaches. Many classical applications and tasks in natural language processing, such as tagging (automatically assigning parts of speech to words), parsing (automatically assigning a syntactic representation to a sentence) and the recovery of semantic representations, are being investigated from a multilingual perspective. The final goal of this line of work is to uncover cross-linguistic regularities, to extend new techniques automatically to new languages, and to make use of large amounts of data; but computational modelling can also interact with large-scale linguistic work at other levels. From the point of view of theory, the properties of these computational models might shed light on some of the properties of the generative processes underlying natural language. Methodologically, computational models and machine learning techniques provide robust tools to test the predictive power of proposed cross-linguistic generalisations. In this talk, I will demonstrate that many type-level explanations of language universals, such as Cinque's, show weaker predictive ability than expected when formalised and tested on independent test sets. I will then illustrate a model of cross-linguistic syntactic correspondence that uses annotated data from one language to transfer information to a new language.
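The general idea behind cross-linguistic transfer of annotations can be sketched very simply. The following minimal example (not the model presented in the talk; the function, the sentences, and the alignment links are all invented for illustration) projects part-of-speech tags from an annotated source-language sentence to an unannotated target-language sentence through word alignments:

```python
# Illustrative sketch of annotation projection, the simplest form of
# cross-lingual transfer: tags assigned to source-language tokens are
# copied to the target-language tokens they align with.

def project_tags(source_tags, alignment, target_len, default="UNK"):
    """Project tags from source to target via (src_idx, tgt_idx) alignment links.

    source_tags: list of tags, one per source token
    alignment:   iterable of (source_index, target_index) pairs
    target_len:  number of target tokens
    default:     tag given to unaligned target tokens
    """
    target_tags = [default] * target_len
    for src_i, tgt_i in alignment:
        target_tags[tgt_i] = source_tags[src_i]
    return target_tags

# English source "the cat sleeps" aligned to French target "le chat dort"
source_tags = ["DET", "NOUN", "VERB"]
alignment = [(0, 0), (1, 1), (2, 2)]
print(project_tags(source_tags, alignment, 3))  # ['DET', 'NOUN', 'VERB']
```

In practice the alignments come from an automatic word aligner and are noisy, so projected annotations are typically filtered or used only as weak supervision for training a target-language model.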