INA hosted our 6th plenary meeting in Paris at the beginning of February. Together, we hashed out the final stages of development of our multimodal solutions and the final evaluation.
We have made our open source contributions available at the MeMAD GitHub repository. If you haven’t checked them out yet, be sure to follow the development there. In addition to tools, the repository also contains, for example, the metadata interchange formats that the MeMAD project has developed so that the components can work together efficiently and easily.
In addition to our open source contributions, our industry partners are providing their platforms for the project's use. Lingsoft is bringing speech recognition and speaker segmentation into the prototype via the Speech Service APIs, and is providing tools for Named Entity Recognition and linking to Wikidata via our Language Management Central API. Limecraft is bridging the gaps between the tools by providing its Flow platform, into which all of the tools and methodologies are integrated.
Multimodal solutions for richer descriptions
The components we have developed work well together. Named Entity Recognition and linking help with translation and speaker segmentation. Similarly, speech recognition from Lingsoft and audio event detection from Aalto University can be linked with facial recognition developed by EURECOM and automatic visual scene captioning developed by Aalto University, to name a few of the multimodal analysis topics. As a result, we are able to produce richer descriptions of video content by combining multiple modalities and approaches.
None of this would be possible without the partners who bring data and insight into user needs: INA and Yle. INA is providing us with French data and its own tool development, for example a tool that recognizes the gender of the speaker, which was featured in an earlier blog post. In addition to providing Finnish and Swedish data, Yle brings in its media professionals, who are the first to test our ideas.
Due to the global pandemic, our online collaboration is now more important than ever. We gather biweekly and hold weekly video calls in dedicated task force groups. In these meetings, we are able to fine-tune the details of the prototype and the final evaluations.