Britannica Sues OpenAI Over AI Training Data

Key Points:

  • Britannica and Merriam-Webster are suing OpenAI for using their content to train ChatGPT.
  • They claim OpenAI copied nearly 100,000 articles and that ChatGPT’s summaries harm Britannica’s web traffic.
  • OpenAI defends its actions, stating its models use publicly available data and are grounded in fair use.
  • Britannica seeks monetary damages and a court order to stop the alleged infringement.

Encyclopedia Britannica and its Merriam-Webster subsidiary have sued OpenAI in Manhattan federal court. They claim OpenAI improperly used their reference materials to train its artificial intelligence models.

Britannica stated in the complaint, filed on Friday, that Microsoft-backed OpenAI used its online articles and encyclopedia and dictionary entries. This material allegedly helped teach OpenAI’s main chatbot, ChatGPT, to respond to human questions. Britannica also argues that ChatGPT “cannibalized” its web traffic by providing AI-generated summaries of its content.

In response to the lawsuit, an OpenAI spokesperson commented on Monday, “Our models empower innovation, and are trained on publicly available data and grounded in fair use.” Britannica’s spokespeople and lawyers did not immediately respond to requests for comment.

This lawsuit is one of many important cases in which copyright owners, including writers and news organizations, have sued tech companies. These owners claim tech companies used their copyrighted work to train AI systems without permission. Britannica itself filed a similar lawsuit, which is still ongoing, against AI startup Perplexity AI last year.

AI companies typically argue that their systems use copyrighted content fairly. They say they transform the original material into something new, which falls under “fair use.”

However, Britannica’s lawsuit claims that OpenAI illegally copied almost 100,000 of its articles. These articles were used to train OpenAI’s large language models, known as GPT. The complaint highlights that ChatGPT produces “nearly identical” copies of Britannica’s encyclopedia entries, dictionary definitions, and other content. This, Britannica argues, causes users to bypass its websites.

Britannica also accused OpenAI of infringing its trademarks. It claims OpenAI implies that it has permission to use Britannica’s material and wrongly attributes inaccurate AI “hallucinations” to Britannica.

Britannica seeks an unspecified amount of money for damages. It also wants a court order to stop the alleged infringement of its copyrights and trademarks.

EDITORIAL TEAM
Al Mahmud Al Mamun leads the TechGolly editorial team. He served as Editor-in-Chief of a world-leading professional research magazine. Rasel Hossain supports the team as Managing Editor. Our team is made up of technologists, researchers, and technology writers. We have substantial expertise in Information Technology (IT), Artificial Intelligence (AI), and Embedded Technology.