Project Goals

The MeMAD consortium aims to develop automatic language-based methods for managing, accessing and publishing pre-existing and originally produced Digital Content in an efficient and accurate manner within the Creative Industries, especially in TV broadcasting and in on-demand media services.

“Digital Content” comprises audiovisual material along with various ‘ancillary’ texts such as captions, descriptions in different languages, and hyperlinks to related content, similar to what hypertext is to plain text. Digital content forms the central material with which the Creative Industries of today and tomorrow operate. More specifically, therefore, this project aims to develop methods and models for producing enhanced digital audiovisual information in multiple languages and for various use contexts and audiences, and to industrialise these results with demonstrable proofs of concept. To achieve these aims, the MeMAD consortium has specified the following four objectives. These objectives will be implemented by our work packages and project-wide use cases, which will also serve as additional ways to measure our success in reaching the objectives and expected impacts.

Objective O1: Develop novel methods and tools for digital storytelling

The state of the art in both fiction production and factual storytelling strongly relies on human interpretation. Adding captions, descriptions and links to related content, if done at all, is a slow and expensive process. By combining the best available knowledge in machine processing, machine learning and human editing of verbal description, this project aims to industrialise the process of digital storytelling and the re-use and re-purposing of existing media as resources by both media producers and media consumers. This can be achieved by combining automatic speech and audio recognition, computer vision techniques, human techniques and strategies for describing audiovisual content, and machine learning, and by using language-based tools to disclose large archives of audiovisual data in an efficient and accurate manner.

Objective O2: Deliver methods and tools to expand the size of media audiences

In the Creative Industries, the size of the audience for which media assets are produced is a determining factor for the feasibility of the production process. Tools that enable producers to create alternative versions of the same product at marginal cost allow them to expand the size of the intended audience. For example, by making media multilingual or equipping it with additional descriptions that serve the deaf, hard-of-hearing, blind, and partially-sighted audiences, the project will both reduce the overall cost of productions and offer the media experiences to wider audiences.

Objective O3: Develop an improved scientific understanding of multimodal and multilingual media content analysis, linking and consumption

The development of novel media distribution services necessitates better understanding of the underlying human requirements and expectations for such services and of the methodological possibilities for meeting these needs. A central aim is to model how semantic knowledge can and should be extracted from digital content. In our interdisciplinary consortium, we have the necessary expertise to further the understanding of these fundamental issues.

Objective O4: Deliver object models and formal languages, distribution protocols and display tools for enriched audiovisual data

Several isolated attempts have been made in the past to develop standardized object models and formal languages for describing multimodal and multilingual media data, but none of these has reached the level of industrial adoption. Similarly, the present distribution protocols enable streaming of linear content and dispatching of webpages, but there is no efficient method for distributing fragments of enriched and hyperlinked audiovisual content and displaying them in a viewer or media player. In the MeMAD project, our objective is to evaluate and further develop the existing media data models and the value chain from them to the media consumers.

The overall aim of the project will be achieved by pursuing the objectives O1-4 above, which can be fulfilled and measured within the duration of the project. The objectives are linked to four Use Cases determined by the representatives of the Creative Industries participating in the project:

Use Case UC1: Content delivery services for the re-use by end-users/clients through media indexing and video description

Online media delivery platforms rely heavily on media metadata in supplying, recommending and grouping digital media to clients. This use case aims to enhance the end-user experience of such services by creating and making use of rich metadata and hyperlinking created by automated media analysis and multimodal media indexing.

As a result, users of such delivery services should be able to discover and watch media that are meaningful to them from a spectrum of starting points and interests that is significantly broader than what can be achieved by current methods of metadata creation. Users should, for example, be able to browse and discover themes, people and places from media, and parts of media containing these even when the information has not been entered by production staff or the original media product has been designed for a different purpose.

Performance can be measured by monitoring the time spent in a service and number of media starts based on recommendations and successful searches. The success in use case UC1 contributes to MeMAD objectives O1, O3 and O4.

Use Case UC2: Creation, use, re-use and re-purposing of new footage and archived content in digital media production through media indexing and video description

This use case aims to improve discoverability and re-usability of digital-born as well as pre-existing media for the purpose of crafting new stories and audiovisual concepts. Media professionals are provided with rich and relevant relationships between archive media, scripts and raw footage during different stages of digital media production, enabling them to develop a digital story and concepts with the help of automated metadata extraction and media analysis. Relevant media fragments are automatically recommended, which saves significant amounts of editorial work compared with conventional methods of research in media archives.

Performance can be measured by monitoring production costs and productivity of archive research and footage management. The success in use case UC2 contributes to MeMAD objectives O2, O3 and O4.

Use Case UC3: Improving user experience with media enrichment by linking to external resources

A video program may be edited using a complex narrative, but viewers have different backgrounds and interests and may not be familiar with all the elements being presented, which triggers the need to go into more depth on some aspects. Video programs also trigger social media reactions (e.g. on Twitter or Facebook), where viewers sometimes clip and repurpose original parts of the video program. One way to improve the user experience is to give individual users the possibility to access and explore related material (e.g. videos, news articles or sets of facts extracted from encyclopedias) containing the additional information that they personally need or are interested in to better understand the narrative of the video program.

External material may be essential for understanding the audiovisual content. For example, when decades-old audiovisual content from the archives is republished, understanding its meaning may require additional material that gives the historical context and explains how to interpret the content.

Performance can be measured by monitoring the amount of additional content explored by users while accessing a video program. The success in use case UC3 contributes to MeMAD objectives O2, O3 and O4.

Use Case UC4: Automated subtitling/captioning and audio description. Speech and sounds to text and also visual content to text, both with multiple output languages, for general purpose use and for the deaf, hard-of-hearing, blind, and partially-sighted audiences

This use case addresses an urgent requirement to enhance as much content as possible with complementary subtitles and audio description. Conventionally, these are created by human subtitlers and translators, at a total production cost ranging from 1000-1200 Euro per hour (for subtitling) up to 3000 Euro per hour (for audio description). In addition, manual subtitling and audio description require a significant cycle time of one to two weeks.

For this use case, we will seek to maximise the productivity of both subtitling (same-language as well as language-to-language) and audio description processes through “supervised automation”.

Performance is measured by the decreased cost per hour of processed material and decreased cycle time that result from: (1) increasing automation level and multilingualism, (2) enhancing information present in subtitles and audio description, and (3) reducing the average production costs for subtitling and audio descriptions by increasing the percentage of content provided with machine-generated subtitles and audio descriptions.

The success in use case UC4 contributes to MeMAD objectives O1 and O2.