Building a semantic matching service for radio broadcasters
Radio broadcasters at VRT currently use a radio manager to prepare their programs. The preparation covers the segments of the show, their order, and their content. During the show, the broadcaster uses this preparation but does not always stick to the plan. This makes it difficult to know which prepared items were actually aired, what was said during these segments, and in what order they were broadcast. To enable personalised radio broadcasting, VRT wants to build a service that automatically analyses the radio broadcasts and matches the resulting transcripts to the preparation.
To enable semantic matching of the preparation with the radio broadcast, we set up a service that receives an audio file for each radio segment. The service transcribes each audio segment using speech-to-text technology. For this speech-to-text step we build a custom dictionary of names, song titles, and similar terms. The dictionary contains the words that, based on the preparations, are likely to occur in the audio segments. Next, we semantically match each transcript with the content of the prepared items in the radio manager. The semantic matching uses deep learning methods such as sentence embeddings. Finally, we write the enriched data to a central database so that all services within VRT can use it.
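The matching step can be illustrated with a minimal sketch. It is not the service's actual implementation. To stay self-contained, it replaces the sentence embeddings with a simple bag-of-words vector, and the function names (`embed`, `cosine`, `match_segments`), the example texts, and the `threshold` value are all assumptions made for illustration. In a real setup, `embed` would call a sentence-embedding model instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real service would return a dense vector from a
    sentence-embedding model here.
    """
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_segments(transcripts, prepared_items, threshold=0.2):
    """Pair each transcript with its most similar prepared item.

    Returns one (item_index, score) tuple per transcript. item_index is
    None when no prepared item reaches the (hypothetical) threshold,
    which models a segment that was aired but never prepared.
    """
    item_vectors = [embed(item) for item in prepared_items]
    matches = []
    for transcript in transcripts:
        vec = embed(transcript)
        scores = [cosine(vec, item_vec) for item_vec in item_vectors]
        if not scores:
            matches.append((None, 0.0))
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] < threshold:
            matches.append((None, scores[best]))
        else:
            matches.append((best, scores[best]))
    return matches


# Invented example data: the prepared order differs from the aired order.
prepared_items = [
    "Interview with the mayor about the new cycling lanes",
    "Traffic update for the morning commute",
    "Listener quiz about songs from the eighties",
]
transcripts = [
    "and now the traffic update, expect delays on your morning commute",
    "welcome mayor, tell us about the new cycling lanes in the city",
    "thanks everyone, see you tomorrow",
]

matches = match_segments(transcripts, prepared_items)
for transcript, (index, score) in zip(transcripts, matches):
    label = prepared_items[index] if index is not None else "(no prepared item)"
    print(f"{score:.2f}  {label}  <-  {transcript}")
```

Because each transcript is matched independently of its position, the result shows both which prepared items were aired and in what order. Transcripts with no match point to unplanned segments.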
In the short term, the key added value is that VRT's archives become easily searchable through the transcripts of all radio segments. In the long term, this application offers new opportunities for personalised content for listeners. Imagine a fully personalised radio station with all the content of your favourite radio presenters and news items. The application also opens up new possibilities for recommendation services, because the structured data it produces can be used to train new models.