Project Gutenberg puts 5,000 audiobooks online for free using synthetic speech

Devin Coldewey

19 September 2023 at 12:36 pm

Open book repository Project Gutenberg has turned thousands of its titles into audiobooks practically overnight using synthetic speech, available now for download or streaming on multiple services. The selection is a bit idiosyncratic (as indeed the archive's is generally) but it is nevertheless a powerful demonstration of accessibility in literature.

Making an audiobook via traditional narration naturally takes quite a long time even in the best case, and of course the reader must be paid for their time and there is the matter of editing and publishing. For many titles it doesn't make sense financially to produce an audiobook, meaning many older and more obscure titles remain difficult for people who prefer that format to consume.

Project Gutenberg is, of course, dedicated to promulgating public domain literature in as many formats as possible, and filling this gap has likely been on their to-do list for years. But it was only when they teamed up with MIT and Microsoft that they were able to perform the kind of code magic necessary to use AI-generated speech to bring these books to life.

The problem with PG's archive, as valuable as it is, is that the files are not uniformly formatted. They come from various sources, often error-ridden optical character recognition processes, and are often imperfectly edited and corrected by volunteers. Even if they were flawless, it does not follow that the format would be easily read by a machine: you would end up with narration of page numbers, footnotes, and other ephemera.

"Each one of the e-books in Project Gutenberg is in its own idiosyncratic html format with lots of text you wouldn’t want to hear read aloud like tables, contents, indices, page numbers etc. The hardest part of the project was extracting the good text to read aloud," explained project co-lead Mark Hamilton, affiliated with Microsoft and MIT.

To solve this, they designed a system that worked through the archive and identified book files that were formatted similarly, then figured out which of those clusters were the best suited to being automatically read out.
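
The article does not say how the team grouped similarly formatted files, so the following is only a rough sketch of the general idea, not their method. It assumes each book is available as an HTML string and treats the sorted set of tag names a file uses as a crude "format signature"; the names format_signature and cluster_by_format are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every opening tag seen in a document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def format_signature(html):
    """Return a crude structural fingerprint: the sorted tag names used.

    Assumption for illustration only; a real system would likely use
    much richer features than the bare set of tags.
    """
    collector = TagCollector()
    collector.feed(html)
    return tuple(sorted(collector.tags))


def cluster_by_format(books):
    """Group book ids whose HTML shares the same signature."""
    clusters = defaultdict(list)
    for book_id, html in books.items():
        clusters[format_signature(html)].append(book_id)
    return dict(clusters)


if __name__ == "__main__":
    sample = {
        "a": "<html><body><h1>T</h1><p>x</p></body></html>",
        "b": "<html><body><h1>U</h1><p>y</p><p>z</p></body></html>",
        "c": "<html><body><table><tr><td>1</td></tr></table></body></html>",
    }
    for signature, ids in cluster_by_format(sample).items():
        print(signature, ids)
```

Clusters made up of plain headings and paragraphs would presumably be easier to read aloud than ones dominated by tables, which is the kind of judgment the team describes making when choosing the first batch.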

This first batch, being somewhat constrained in its selection, is a little idiosyncratic: for instance, there is only one Dickens book (the unfinished "Edwin Drood" at that) but a dozen volumes along the lines of "Notes and Queries, Number 176, March 12, 1853 A Medium of Inter-communication for Literary Men, Artists, Antiquaries, Genealogists, etc."

"We picked the books for the first batch based on what we felt the automated parser could do reasonably well," Hamilton continued. "Nevertheless, some key good ones fell through the cracks. Now that we have the first batch out, we’re working to generalize the system to get closer to the full 60k books in a future release."

As for the narration itself, the team has put together multiple machine learning and synthetic speech tools that have improved and become more accessible over the last few years. A few years ago it was obvious that automated audiobook production would soon arrive, and so it has, and at scale.

Here's how the paper on the project describes their approach to making a generated audiobook engaging:

To create an emotive reading of the text, we use an automatic speaker and emotion inference system to dynamically change the reading voice and tone based on context. This makes passages with multiple characters and emotional dialogue more life-like and engaging. To this end, we first segment the text into narration and dialogue and identify the speaker for each dialogue section. We then predict the emotion of each dialogue using in a self-supervised manner. Finally, we assign separate voices and emotions to the narrator and the character dialogues using the multi-style and contextual-based neural text-to-speech model proposed in.
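
The first step in that passage, splitting text into narration and dialogue, can be illustrated with a very small sketch. This is not the paper's system: it simply assumes dialogue is whatever sits between matching double quotation marks (straight or curly), and it makes no attempt at speaker identification or emotion prediction. The function name segment is hypothetical.

```python
import re

# Assumption: dialogue is enclosed in straight or curly double quotes.
QUOTE_RE = re.compile(r'"([^"]*)"|“([^”]*)”')


def segment(text):
    """Split text into ("narration", ...) and ("dialogue", ...) pieces."""
    segments = []
    pos = 0
    for match in QUOTE_RE.finditer(text):
        # Anything between the previous quote and this one is narration.
        narration = text[pos:match.start()].strip()
        if narration:
            segments.append(("narration", narration))
        # Exactly one of the two groups matched; take whichever it was.
        spoken = match.group(1) if match.group(1) is not None else match.group(2)
        spoken = spoken.strip()
        if spoken:
            segments.append(("dialogue", spoken))
        pos = match.end()
    # Trailing text after the last quote is narration as well.
    tail = text[pos:].strip()
    if tail:
        segments.append(("narration", tail))
    return segments


if __name__ == "__main__":
    for kind, piece in segment('"Come in," said Alice. "It is late."'):
        print(kind, "|", piece)
```

Each labeled piece could then be handed to a different synthetic voice, which is roughly the role the quoted passage gives to the later speaker and emotion stages.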

The first 5,000 or so books are available to listen to for free on Spotify, Apple Podcasts, and the Internet Archive, and the code used to create them is being documented at GitHub.

