Will AI software also replace audiobook narrators? The mail

Last week, Apple unveiled a catalog of audiobooks narrated by artificial intelligence software: a synthetic, computer-generated voice that reads the text in a surprisingly realistic, human-like manner. According to Apple, it is “a valuable addition to professionally narrated audiobooks” that could broaden the audience for audiobook versions of novels and non-fiction.

The digital narration service is aimed at independent authors and small publishers who cannot afford to pay a professional voice actor to record the full text of a book. There are also plans to work with Draft2Digital, a US self-publishing service (one that lets authors publish a book without a publisher), through which authors can “submit” their book to have a version narrated by artificial intelligence.

For now, the offering is limited to titles available on Apple Books, Apple’s e-book sales portal, with a few caveats: works must be in English and belong to the fiction or romance genres, while thrillers, science fiction, and mysteries are not yet supported. Publishers can also choose the reading voice from four options, each designed for a particular genre or mood, from fiction to non-fiction.

On the Apple website you can listen to short excerpts from books read by digital voices that are quite believable and sound remarkably human. For this reason, Apple has been criticized and accused of using AI to make professional narrators obsolete and replace them with software. David Caron, who produces audiobooks for a Canadian publisher, told the Guardian how important narrators are: in his view they create something different from the printed book that adds value as an art form.

On the other hand, a large-scale rollout of text-reading software of this kind would most likely be a major advance in the availability of audiobooks for blind and partially sighted people, because it would allow many more titles to be produced. Audiobooks are among the reading tools most widely used by blind people, alongside speech synthesizers, which read digital text files aloud in artificial voices that are less expressive than a human voice.

The artificial intelligence developed by Apple is just one example of the power of “text-to-speech” technology, which makes it possible to create a digital voice that reads written content aloud. All the big tech companies, from Meta to Google, have invested in the sector, as have many smaller startups. Microsoft recently unveiled VALL-E, an artificial intelligence that can simulate a person’s voice from a clip as short as three seconds by analyzing the speaker’s characteristics and reproducing them digitally. The name VALL-E pays homage to DALL-E, a successful model developed by OpenAI that generates images from a text description.

As is often the case with Apple, the technology’s specifications are kept secret, but many startups and companies have long invested in what is called speech synthesis, the mechanism that allows a person’s voice to be reproduced digitally. A similar process underlies Siri, Apple’s voice assistant, Amazon’s Alexa, and the voice that TikTok makes available to creators to read out the text captions of their videos.

Although there is no definitive confirmation, fairly credible theories circulate about the identity of the people whose voices were used as the basis for these voice assistants. In the case of Alexa, for example, Brad Stone, a journalist and biographer of company founder Jeff Bezos, explained in his latest book that the original voice came from Nina Rolle, an American voice actress and narrator. Siri’s voice, meanwhile, is reportedly that of Susan Bennett (a theory never confirmed by Apple but backed “one hundred percent” by a forensic analysis commissioned by CNN).

Speech synthesis is based on collecting and analyzing an archive of voice recordings with artificial intelligence that can break them into pieces and concatenate them to create new sounds and new words. Even today, the technique’s main limits concern the intonation and the narrow acting range of these synthetic voices. With its own service, too, Apple has preferred to concentrate on non-fiction and, above all, has avoided thrillers, adventure or romance novels, which would otherwise be narrated with “zombie serenity”, as Slate writes.
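
The idea of breaking recordings into pieces and concatenating them can be illustrated with a toy sketch. It is only an illustration of concatenation in general, not Apple’s undisclosed method: the function names, the tiny “inventory” of units and the sample values are all invented for the example, and real systems work on actual audio with far more sophisticated models.

```python
# Toy illustration of concatenative synthesis: stored snippets are joined
# end to end, with a short linear crossfade at each seam.
# All names and sample values here are hypothetical, not real audio.

def crossfade_concat(left, right, overlap):
    """Join two lists of samples, blending the last `overlap` samples of
    `left` with the first `overlap` samples of `right`."""
    overlap = min(overlap, len(left), len(right))
    if overlap == 0:
        return left + right
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight given to the incoming unit
        blended.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    return left[:len(left) - overlap] + blended + right[overlap:]


def synthesize(unit_names, inventory, overlap=2):
    """Build an output signal by chaining the named units from the inventory."""
    out = []
    for name in unit_names:
        out = crossfade_concat(out, inventory[name], overlap)
    return out


# Made-up "recorded" units, standing in for fragments of a voice archive.
INVENTORY = {
    "he": [0.0, 0.2, 0.4, 0.2],
    "llo": [0.1, 0.3, 0.1, 0.0],
}

samples = synthesize(["he", "llo"], INVENTORY)
print(len(samples), "samples after joining two 4-sample units")
```

The crossfade stands in for the smoothing needed where two fragments meet; the flat intonation the article mentions is the kind of thing such joining alone cannot fix.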

Beyond the quality of the product, Apple’s move has also stirred debate because of the impressive results shown in recent months by services such as Midjourney, ChatGPT and DALL-E itself, which can generate images and text. The digital narration service also confirms Apple’s ambitions in the audiobook sector, where it holds a prominent position thanks to the Apple Books platform. However, Amazon dominates the field, thanks both to its e-commerce site and to Audible, its platform specializing in audiobooks.

The competition between the two companies is also intensifying because the audiobook market itself is growing: global sales were just over four billion dollars in 2021 but are expected to rise to over 35 billion by 2030. In this context, Amazon and Apple present themselves not only as online bookstores but as platforms on which authors can publish and monetize their works, helped by a much higher share of sales revenue than traditional publishers guarantee to authors.
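
To give a sense of what those two figures imply, the short calculation below works out the constant yearly growth rate that would take the market from about 4 billion dollars in 2021 to about 35 billion in 2030. It is a rough sketch: the article gives only approximate values (“just over” and “over”), and the function name is made up for the example.

```python
def implied_annual_growth(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value
    over the given number of years (compound growth)."""
    return (end_value / start_value) ** (1 / years) - 1


# Approximate figures from the article, in billions of dollars.
rate = implied_annual_growth(4.0, 35.0, 2030 - 2021)
print(f"Implied average growth: {rate:.1%} per year")
```

In other words, the forecast assumes sales growing by roughly a quarter or more every year for nine years in a row.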

The industry’s growing success also prompted Spotify to invest in 2021 in Findaway, an audiobook sales platform that offers similar services and with which the music streaming company has expanded its content offering. This strategy began in 2019, when Spotify started investing in podcasts as well, acquiring the production studio Gimlet Media and the rights to some hugely successful titles. More recently, Spotify has also been working on the ability to buy audiobooks directly from the app.

Or rather, it tried to: the app update containing this feature was rejected three times by the App Store, Apple’s application store, which accepted it only after imposing a series of changes on Spotify that made buying audiobooks more complicated. The dispute is part of a long history of conflict between the two companies: through the App Store, Apple keeps 30% of every purchase made through the apps available for iPhone and iPad, including Spotify subscriptions. For this reason, in many cases, applications force users to complete purchases on the company’s website, directly from a browser.

“The truth is we’ve seen this type of attitude before at Apple. That’s why we filed a complaint against Apple with the European Commission four years ago,” reads a website created by Spotify to denounce Apple’s policies. Epic Games, the video game company behind Fortnite, has also sued Apple for the same reason, arguing that it should not have to give up 30% of every purchase made by gamers through the mobile app. The case is still ongoing, but the first-instance verdict in 2021 largely sided with Apple. Apple’s sales of digital services, which include Apple Music, Apple TV, Apple Pay, gaming, and in-app purchases, were worth $78 billion in 2022.