Stakeholders expressed their needs at MeMAD kick-off: “I would really have liked to know what Mikko Kurimo looks like.”

The MeMAD project launched with a vibrant discussion and exchange of ideas with its stakeholders at the beginning of 2018. We wanted to organise a kick-off event with all the central stakeholders as they play a key role in shaping the MeMAD project and its outcomes. In this post, we review our stakeholders’ wishlist about what MeMAD should aim for in the coming three years.

The atmosphere at the MeMAD kick-off seminar, hosted by the Finnish Public Broadcasting Company ‘Yle’ in mid-January, was one of anticipation and excitement. Over 70 representatives, including our most relevant stakeholders, were present. In addition to presenting them with an overview of the MeMAD project, we had the privilege of listening to stimulating talks by interested parties from broadcasting, the media industry, and key user groups.

“Independent participation in the audiovisual culture”

MeMAD aims to meet the needs of the broadcasting industry, but also of the media end-users. Eija-Liisa Markkula, the chair of the Finnish NGO Cultural Service for the Visually Impaired (Kulttuuria kaikille ry), understands what it is like to be excluded from the audiovisual culture – television programs, films, and the like – but also from the everyday experience. Markkula has been blind for 20 years, and while she has visual memories and can use her hearing and other senses to extract and interpret information from the environment, it is not enough: “I would really have liked to know what Mikko Kurimo [MeMAD project leader who gave the opening talk in the seminar] looks like. Everyone else in this room knows it, except for me”, Markkula reminded us in her talk about the meaning of visual culture. Sighted audiences, after all, are constantly observing and interpreting other people’s appearance and actions. Markkula hopes that MeMAD will take the users into account when designing solutions, and in general, she is hoping for more (audio) description that encourages independent participation in the audiovisual culture.

A poster with comments on Post-it notes.

Stakeholders shared their views and wishes, for example by commenting on posters during the workshop session.

“The future of journalism should interest MeMAD”

Are Tverberg, metadata architect from TV2 in Norway, represented our other key stakeholder group – the broadcasting industry – and suggested that developments be made in the area of metadata: in particular, metadata should be designed and written in the interests of all, taking into account both advancements in Artificial Intelligence and changes in the newsroom. Along the same lines, Geir Børdalen from the Norwegian NRK noted that there is an abundance of audiovisual data, but the metadata is still poor. Børdalen also brought us a practical message: rather than trying to tackle many problems and end up solving nothing well, MeMAD should concentrate on one task and create one proper solution.

As stated by several stakeholder representatives at the MeMAD kick-off seminar, personalisation is the future of video and access services. This also involves engaging the end-user in the production process, said Mike Matton from the innovation and R&D department of the Belgian broadcasting company VRT. Since speech-to-text techniques already support automated subtitling workflows, the next level is to provide a similar system for image-to-text.

“Hoping to make more content accessible with partial automatisation”

Tobias Schwahn, from the German public broadcasting channel ZDF, looks forward to making more content available by the – at least partial – automatisation of metadata and access services. Apart from MeMAD’s key impact perspective, namely that textual description of audiovisual content would improve accessibility for metadata users – especially journalists and researchers using media archives – the project also aims to make the content reusable for new audiences. Description and text are at the core of television’s access services, such as audio description that renders images into words, and subtitling that makes speech and sounds visible via written text. These services make audiovisual content accessible to visually and hearing-impaired people but are currently expensive to produce due to the human effort required.

Partial automatisation of the description process could also be designed to assist human describers and reduce the workload of professionals, such as audio describers, according to Bernd Benecke, head of the audio description department at the German public broadcaster Bayerischer Rundfunk. In the light of increasing demands and obligations to produce greater volumes of audio-described content, purely manual work is becoming too expensive. The broadcasting industry would therefore benefit from the development of technological tools that assist the describers. Perhaps, then, partially automated description workflows are the solution?

What next?

All of these views from important stakeholders certainly give food for thought about the MeMAD solution. Could we design a model for an automated video description workflow that would be scalable and personalisable to meet the requirements of different users? The workflow would allow those working in it to choose between different styles of description, from a very simple, automatically created description or annotation of the most salient visual aspects to a refined, more interpretative description that is perhaps post-edited by a human describer.

MeMAD consortium members with Mikko Kurimo second from right in the front row.

And as we are not there yet, let us see what Eija-Liisa missed because of the lack of sight and description: Mikko Kurimo is approximately 1.75 m tall, of slim build, and wore black trousers and a black shirt. These are undoubtedly important aspects which a sight-impaired audience might want to share. But perhaps what Eija-Liisa would have liked to know above all else was that when she spoke, Mikko had a kind look in his eyes.