AI and speech-to-text - alugha at the NAB Show
AI, the film industry and online video are now very closely linked. Even though the big players are already established, teamwork is more important than ever.
My first stop at NAB Show 2019 was a workshop on AI and the film industry. It was a short but very effective panel discussion that once again showed how important AI will be in the future for certain key issues in the film industry. Dave Cole of IPV Ltd, Steven Soenens of Skyline Communications, Stan Moote of IABM and Joe Addalia, Director of Technology Projects at Hearst Television, were there to give their opinions and answer questions.
Addalia's statements were especially interesting. He has been dealing with AI for speech-to-text and automatic transcription for a long time. We've all seen this on YouTube: the results are sometimes really useful, but often the complete opposite. While Google relies entirely on a self-learning AI, there are other interesting approaches, and alugha is a good example. Once the AI has split the spoken word into individual segments, people do the fine-tuning and reach a hit rate of almost 100% in a very short time. The AI learns continuously and keeps improving.
For Addalia, it is of the utmost importance not to be selfish and think only about putting yourself on the market. Google and other global companies are currently a prime example of this behavior. Everyone wants to be the best and the first, but with so many languages, dialects, words, neologisms and constantly evolving languages, this will be a difficult if not impossible task. He appeals to the makers to join forces in order to deliver accurate results.
Asked whether he sees AI as a job killer in this area, he says that he has not heard any complaints so far. On the contrary, people aren't afraid to use it as a tool for quickly turning the spoken word into written text.
I can only agree. If you consider that more than 500 hours of video are uploaded to YouTube every minute, it's almost impossible to handle that manually anyway. And if you take into account the "ultra cool" factor of speech-to-text, namely that a search engine can suddenly understand and search the spoken word as text, then the topic clearly belongs at the forefront.
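To put a rough number on that, here is a small back-of-the-envelope sketch in Python. The 500 hours per minute is the figure mentioned above. The four hours of work per hour of video and the eight-hour working day are purely my own assumptions for illustration, not measured values.

```python
# Back-of-the-envelope estimate of the manual transcription workload.
UPLOAD_HOURS_PER_MINUTE = 500  # figure quoted in the text above
TRANSCRIPTION_FACTOR = 4       # ASSUMPTION: hours of work per hour of video
WORKDAY_HOURS = 8              # ASSUMPTION: one transcriber's working day


def video_hours_per_day(hours_per_minute: float) -> float:
    """Hours of video uploaded in 24 hours at a constant upload rate."""
    return hours_per_minute * 60 * 24


def transcribers_needed(hours_per_minute: float,
                        factor: float,
                        workday_hours: float) -> float:
    """Full-time transcribers needed to keep pace with one day of uploads."""
    return video_hours_per_day(hours_per_minute) * factor / workday_hours


if __name__ == "__main__":
    daily = video_hours_per_day(UPLOAD_HOURS_PER_MINUTE)
    people = transcribers_needed(
        UPLOAD_HOURS_PER_MINUTE, TRANSCRIPTION_FACTOR, WORKDAY_HOURS
    )
    print(f"{daily:,.0f} hours of new video per day")
    print(f"{people:,.0f} transcribers needed to keep up")
```

Under these assumptions that comes to 720,000 hours of new video per day and around 360,000 people transcribing full time just to keep up.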
One of the big plus points of the NAB Show is indeed the workshops. I can only recommend that everyone in the industry take a look at them. Rarely is there so much concentrated knowledge in so few days.
I'm going straight to the next workshop! Stay tuned!
Until next time!