Language Technology: Computational, linguistic and cognitive perspectives
The research group Language Technology works in the interdependent areas of computational linguistics, natural language processing (NLP) and cognitive modelling with the long-term goal of enriching computational language models with linguistic, cultural and cognitive knowledge.
The group investigates how current language models can be enriched with linguistic information, cognitive signals such as eye-tracking traces, and visual data (combining text with images, or speech with gesture). This research attempts to ground NLP models in data from outside the textual domain and has important societal applications, for instance in mental healthcare.
Another challenge is the language technology gap facing smaller languages, particularly Danish. Here we focus on methods for developing high-quality datasets and lexical-semantic resources that capture the linguistic and cultural traits typical of the Danish language community.
We also work on adapting NLP techniques to data relevant for digital humanities, an emerging field in which a growing amount of material, including text in historical precursors of modern languages, is being digitised and thereby becoming amenable to digital processing.
Computational Cognitive Modelling and Multimodality
In our approach to computational cognitive modelling, we focus on models of language and speech processing, and on the extent to which these models make use of cognitive signals and data from the visual-gestural modality and combine them with linguistic knowledge. One application we investigate is the development of tools that support the detection and monitoring of mental health conditions, including psychosis, depression and cognitive decline.
Methodologies for Developing NLP Language Resources and Benchmark Data
We develop methods for compiling NLP resources and language models that encompass cultural and societal diversity, with a particular focus on Danish and other low- and medium-resourced languages. We perform culture-aware evaluation of language models and provide benchmark data for Danish that span linguistic and lexical variety. We work with annotated datasets, in particular lexical-semantic lexicons and corpora, including resources that span several languages.
Representation Learning for NLP
We contribute to explainable artificial intelligence by exploring the internal representations and learning mechanisms in computational models of natural languages. In particular, through a multilingual approach, we investigate how language models cope with linguistic diversity at various architectural and linguistic levels. Our research seeks to improve the transparency and performance of multilingual language models and ensure more robust and accurate multilingual language processing.
NLP and Digital Humanities
This area includes computational NLP models for the analysis and generation of textual data in a wide range of forms, including poems, novels, letters, manuscripts, news articles, scientific articles and lyrics. We continuously upgrade our NLP pipelines and corpus tools, in particular for Danish, and work to ensure that appropriate methods and gold standards are compiled for evaluating them. This research opens many collaboration opportunities with researchers from the Humanities at large, and NorS in particular.
Centres
The research group is affiliated with the Center for Language Technology (CST).
Projects
Current and recent projects include:
- Measuring Modernity
- The Danish Benchmark project
- Danish Foundation Models
- MultiplEYE-DK
- XHAILe
- When Danes prayed in German
- Central Word Register for Danish (COR)
- Copco: The Copenhagen Corpus of Eye-Tracking Recordings from Natural Reading of Danish Texts
- GEstures and Head Movements in language (GEHM)
- Multimodal Child Language Acquisition
- ParlaMint: Towards Comparable Parliamentary Corpora
- METALLM: Exploring and Improving the Treatment of Metaphor in Large Language Models
- ClimCond: Conditions of change: Conditionals in climate change communications
The group organises internal and public seminars on topics relevant for the four focus areas.
Researchers
| Name | Title | Phone |
|---|---|---|
| Aguirrezabal Zabaleta, Manex | Associate Professor | +4535324829 |
| Al-Laith, Ali Mohammed Ali | Assistant Professor | +4535326658 |
| Basirat, Ali | Associate Professor | +4535325590 |
| Braasch, Anna | Associate Professor Emeritus | +4535329071 |
| Diderichsen, Philip | Special Consultant | +4535324189 |
| Gray, Simon | Special Consultant | +4535337688 |
| Henriksen, Lina | Research Consultant | +4535329082 |
| Jongejan, Bart | IT Officer, FU | +4535329075 |
| Larsen, Bolette Frydendahl | Guest Researcher | +4535320290 |
| Maegaard, Bente | Emeritus | +4535329074 |
| Navarretta, Costanza | Senior Researcher | +4535329079 |
| Norman, Nathalie Carmen Hau | PhD Fellow | +4535331047 |
| Olsen, Sussi | Academic Research Staff | +4535329064 |
| Paggio, Patrizia | Associate Professor | +4535329072 |
| Parola, Alberto | Assistant Professor - Tenure Track | +4535325942 |
| Pedersen, Bolette Sandford | Professor, Deputy Head of Department | +4535329078 |
| Schneidermann, Nina Skovgaard | Enrolled PhD Student | +4535331600 |
Affiliated researchers
- Boye, Kasper
- Conroy, Alexander
- Diderichsen, Philip
- Duncker, Dorthe
- Schachtenhaufen, Ruben