Head over to our on-demand library to view sessions from VB Transform 2023. Register Here
OpenAI, the organization behind ChatGPT and its underlying large language models (LLMs) GPT-3.5 and GPT-4, has filed motions to dismiss in two copyright lawsuits brought against the company over its use of copyrighted materials in AI model training data. The plaintiffs include a pair of U.S. authors and a second group including comedian and actor Sarah Silverman.
In the filings submitted to the U.S. District Court for the Northern District of California on Monday, OpenAI requested the dismissal of five out of the six counts lodged in the lawsuits. The company defended the transformative nature of its LLM technology, underscoring the need to balance copyright protection and technological advancement. OpenAI also said that it planned to contest the remaining count of direct copyright infringement in court as a matter of law.
The motions addressed the claims asserted in the copyright lawsuits and aimed to elucidate the case’s merits. OpenAI underscored the value and potential of AI, particularly ChatGPT, in enhancing productivity, aiding in coding, and simplifying daily tasks. The company likened ChatGPT’s impact to a significant intellectual revolution, drawing parallels with the invention of the printing press.
“You can start to see the story that they’re going to tell here which is that copyright has limitations to it. It doesn’t extend to facts and ideas,” said Gregory Leighton, a privacy law specialist at law firm Polsinelli. “Even if a work is copyright and an LLM, processing it or then producing a summary of it back or something like that, that’s not a derivative work on its face.”
OpenAI based its defense on the fundamental facts of the LLM technology: It is a type of neural network trained on extensive text data to comprehend human language effectively and it enables users to input text prompts and receive corresponding generated content. Per the filings, OpenAI claims its products merge LLMs with parameters ensuring the accuracy, relevance, safety and utility of the produced outputs.
Balancing copyright law and technological innovation
The plaintiffs argued that ChatGPT was trained on their copyrighted works without permission. In response, OpenAI contended that this perspective overlooks the broader implications of copyright law, including fair use exceptions.
The company asserted that fair use can accommodate transformative innovations like LLMs and is aligned with the constitutional intent of copyright law to foster scientific and artistic progress.
“It’s true substantively, but there’s an interesting sleight of hand going on here,” said Leighton.
“You shouldn’t be talking about fair use in a motion to dismiss because fair use is an affirmative defense. It’s actually something that they, as the defendant, have to affirmatively plead and prove up,” he said.
OpenAI’s motion cited court cases where the fair use doctrine protected innovative uses of copyrighted materials. It called for the dismissal of secondary claims from the plaintiffs, including vicarious copyright infringement, violations of the Digital Millennium Copyright Act (DMCA), violations of California’s Unfair Competition Law (UCL), negligence and unjust enrichment. OpenAI challenged the legal validity of these claims and argued that they should be dismissed because they rest on flawed legal reasoning.
“These were probably always the ancillary and companion claims and the main meal here is copyright infringement,” said Leighton.
Vicarious copyright infringement applies when a party benefits indirectly from copyright infringement committed by another person. OpenAI stated that the plaintiffs’ allegations of direct infringement were not valid as a matter of law, that it had no “right and ability to supervise,” and that it had no direct financial interest.
OpenAI’s arguments in favor of dismissal
OpenAI offered rebuttals to the plaintiffs’ various theories of how it violated vicarious infringement rules, the DMCA and the UCL, including the claims that every ChatGPT output is an infringing derivative work of their copyrighted books and that LLM training removes the “copyright management information” from the specified works.
OpenAI contended that the plaintiffs don’t have enough evidence to claim that LLMs produce derivative works, and that if those standards were applied on a wider scale, photographers would be able to sue painters who reference their material. It also argued that the evidence offered by the plaintiffs about copyright management information was contradictory and failed to show how that information was purposely removed.
The company also found deficiencies in the negligence and unjust enrichment claims, saying that there were no grounds for negligence, as OpenAI or its users would be engaging in intentional acts, and that OpenAI did not owe the plaintiffs a duty of care.
Nor, according to the filings, was there any evidence to support the claim that OpenAI held on to profits or benefits from the infringed material.
Finally, OpenAI argued that both the negligence and unjust enrichment claims, which are state law claims, are preempted by federal copyright law.
“It might take a month or six weeks, but the plaintiffs will file a response where they’ll have to say why they think these claims should stay in,” said Leighton. “That actually might be quite interesting just to get their take of where they’re going with this.”
OpenAI’s dismissal request and the path forward
OpenAI’s dismissal motion is founded on ChatGPT’s transformative nature, fair use principles and perceived legal shortcomings in the plaintiffs’ ancillary claims.
The motions provided insight into OpenAI’s overall defense of its ongoing operations as it navigates the complex intersection of copyright law and AI technology advancement.
While Leighton believes that this particular motion to dismiss may not have huge immediate effects, the stakes in the overall case remain high. Because the lawsuits could determine the extent to which large language models can be trained on copyrighted works without infringing copyright, their outcome could have major implications for AI use cases, particularly if a court determines that ingesting copyrighted works always infringes copyright.
“We’re getting the first real insight into where this is really gonna go,” said Leighton. “They’re introducing these things to the judge, not because it really has anything to do with the motion to dismiss itself and what they’re trying to accomplish procedurally, but it’s the intro thematically to [OpenAI’s] side of the case here.”
As the lawsuits unfold, this legal conflict will likely define the future of copyright law and technological progress.
VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact.