‘Digital Narration Technology’ To Deliver Audio For Millions Of Books

Apple today quietly launched digital narration technology that uses artificial intelligence to generate human-sounding narration for books. It sounds like a dangerously bad idea at first: how will AI know what to emphasize, where to get excited, and where to slow down? Yet the small samples Apple has shared sound surprisingly human.

The initial target: long-tail books that will never be worth paying a human narrator for.

“More and more book lovers are listening to audiobooks, yet only a fraction of books are converted to audio — leaving millions of titles unheard,” Apple says. “Many authors — especially independent authors and those associated with small publishers — aren’t able to create audiobooks due to the cost and complexity of production.”

Apple is releasing four voices to start, two female and two male. Each voice is optimized for specific genres: Jackson, a deep, somewhat husky voice, is intended for fiction and romance, while Helen, a soprano, is designed for nonfiction and self-development.

“Mitchell” and “Madison” round out Apple’s initial four voices.

It’s yet another example of generative AI, which is exploding today thanks to OpenAI’s ChatGPT and many other tools and projects, including DALL-E and Midjourney. ChatGPT is already banned in New York City public schools over cheating concerns, but the industry as a whole is expected to grow from almost nothing to over $110 billion in revenue by 2030.

At issue, of course, are human jobs in art and design, copyright on training images and paintings, and now, with AI narrators, human jobs in audiobook creation.

But there are some jobs that AI creates, as well.

“Apple Books digital narration brings together advanced speech synthesis technology with important work by teams of linguists, quality control specialists, and audio engineers to produce high-quality audiobooks from an ebook file,” Apple says. “Apple has long been on the forefront of innovative speech technology, and has now adapted it for long-form reading, working alongside publishers, authors, and narrators.”

All four of Apple’s initial voices are fairly standard American-sounding voices, with slightly differentiated intonation that suggests small variations in ethnic background. While Apple hasn’t said anything about future voices, it’s likely that, if the program succeeds, the company will expand it to include other national accents such as English or Australian, and perhaps regional or ethnic varieties such as Southern American English or African American Vernacular English, or even traditional Boston or New York accents.

Of course, English is just the beginning: Spanish, French, German, and other languages await similar capability.

Apple won’t simply apply the AI voices to every title in its library. There’s a long process to go through: signing up with a preferred partner who will manage the process, picking your title, selecting a voice, choosing cover art, and then waiting one to two months while the book is processed and quality-checked.

Publishing is not guaranteed, Apple says: the narrated book must meet Apple’s quality and content standards.

According to The Guardian’s reporting, however, Apple will cover the costs.

Just a few months ago Spotify, which has a significant business in audiobooks and podcasts in addition to its core music offerings, complained that Apple was engaging in ‘anticompetitive behavior’ regarding audiobook purchases on the Spotify app on iPhones. Spotify will be watching these developments closely, as will the Amazon-owned titan of the audiobook market, Audible.

Early returns sound good, but remember that Apple is only sharing small snippets. The real test will be how entire books turn out.
