MeMAD project partners Lingsoft and the University of Helsinki have received funding in the first call for pilot projects of the European Language Grid (ELG). The funding round was competitive: ten projects out of 110 applications were funded, and two of these ten are closely linked to the MeMAD project! The two pilot projects will provide NLP tools via the ELG platform for research and development. While not all of the tools originate in the MeMAD project, this funding also provides an opportunity to make new developments from the MeMAD project available. The European Language Grid site contains the full list of funded projects.
These two projects are also presented at the META Forum conference, hosted virtually from Berlin, Germany, on December 1-3.
Lingsoft
In this project, Lingsoft's tools for spelling and grammar checking, speech recognition, subtitling, named entity recognition, and machine translation are made available through the ELG platform. The result will be a set of NLP tools for the Nordic languages that companies and public organizations throughout Europe can use to efficiently incorporate Nordic language support into, for example, subtitled videos or customer service chats.
Lingsoft Text Analysis
- Lemmatization (& disambiguation): Finnish, Swedish, Norwegian bokmål and nynorsk, Danish, English
- Spelling & Grammar: Finnish, Swedish, Norwegian bokmål, Norwegian nynorsk (spelling only), Danish, English
- Named Entity Recognition: Finnish, Swedish, Norwegian bokmål, English
- Ontology relation: Finnish, Swedish, Norwegian bokmål (some support), English (some support)
Lingsoft Speech Recognition
- Audio/Video Stream: Finnish, Swedish, Norwegian bokmål, Norwegian nynorsk (on the roadmap)
- Diarisation: Finnish, Swedish
- Subtitling: Finnish, Swedish
Lingsoft Machine Translation
- Finnish-English-Finnish
- Swedish-English-Swedish
- Finnish-Swedish-Finnish
University of Helsinki
Open Translation Models, Tools and Services (OPUS-MT)
OPUS-MT will produce state-of-the-art neural machine translation (NMT) models that can be freely shared, re-used and integrated in open web services and professional translation workflows. The project will focus on European minority languages and on improving their support through multilingual NMT models and transfer learning. Furthermore, OPUS-MT will deliver easily deployable translation services and tools for quick domain adaptation and on-demand personalisation.
Currently, OPUS-MT focuses on Northern Sami; this support will be extended to other Sami languages using transfer learning and multilingual translation models. OPUS-MT will also look at Celtic languages, including Breton, Scottish Gaelic and Welsh, with possible extensions to Cornish and Manx. We also plan to support Galician and Romansh.
Another focus is the integration of machine translation into the workflow of professional translators. We are developing computer-assisted translation (CAT) solutions (OPUS-CAT) with plugins for popular translation workbenches. The implementations can run on local machines, have a modular design, and support all the models we release. These tools will also include fine-tuning functionality to enable personalised translation models.