OpenAI Secretly Trained GPT-4 With More Than a Million Hours of Transcribed YouTube Videos

Victor Tangermann

April 8, 2024 at 11:29 a.m.

Last month, the Wall Street Journal's Joanna Stern sat down with OpenAI CTO Mira Murati to discuss the company's latest text-to-video generator called Sora.

During the brief conversation, Stern asked Murati if Sora was trained on videos from YouTube, Instagram, and Facebook — resulting in a long and awkward pause.

"We used publicly available data and licensed data," Murati said.

"So, videos on YouTube?" Stern shot back.

"I'm actually not sure about that," Murati replied, following what can only be described as a grimace.

https://twitter.com/stokel/status/1768185199412625709

And as it turns out, there's a good reason why the CTO may have been uncomfortable with that question. As the New York Times reports, OpenAI secretly trained its GPT-4 large language model (LLM) with over a million hours of transcribed YouTube videos.

Sources with knowledge of conversations about ripping audio and transcriptions from YouTube videos told the newspaper that the transcripts were fed into GPT-4.

And it's not just OpenAI — YouTube owner Google also harvested transcripts, per the NYT's sources, to train its own AI models.

It's yet another data point illustrating how AI companies are relying on massive amounts of murky and possibly copyright-infringing data to train their models — all without ever fairly compensating the rights holders, let alone asking for their consent.

The practice has already led to a number of lawsuits, with rights holders accusing companies including OpenAI and Microsoft of wrongly claiming that their practices qualify as "fair use," a doctrine of US copyright law that allows limited use of copyrighted material without acquiring permission.

Even the NYT itself has filed a lawsuit against OpenAI and Microsoft, accusing them of copyright infringement.

Last week, days before the NYT published its piece, YouTube CEO Neal Mohan sent a clear message, telling Bloomberg that if OpenAI had in fact trained Sora on YouTube videos, that would be a "clear violation" of the video platform's terms of use.

Google spokesperson Matt Bryant told the NYT that YouTube prohibits any "unauthorized scraping or downloading of YouTube content."

Bryant also told The Verge that the company had already "seen unconfirmed reports" of OpenAI's activity.

To be clear, we still don't fully know the extent to which Sora and GPT-4 are connected. We do know that OpenAI isn't reinventing the wheel for its upcoming text-to-video generator, relying on a translational layer that's powered by its LLM to interpret text prompts.

Maybe the real question is whether ripping a million hours of YouTube videos without permission amounts to stealing. How US copyright law applies to AI training remains a legal gray area, especially when it comes to fair use.

Experts told the NYT that as AI companies churn through the entirety of the internet, licensing all of the content would likely be impossible.

"The data needed is so massive that even collective licensing really can’t work," Sy Damle, a lawyer who represents the venture capital firm Andreessen Horowitz, told the newspaper.

Even without securing all of the rights, AI companies could soon be facing an even stranger challenge: running out of training data entirely.

Researchers found that by 2026, there's a 90 percent chance AI companies could run out of high-quality data to feed their insatiable models. In other words, the likes of OpenAI could eventually have to resort to training their AI models on synthetic, AI-generated output — a dangerous race to the bottom that could have far more disastrous consequences than copyright-related lawsuits.

