ANI accuses OpenAI of copyright violation over ChatGPT training

ANI claims that its exclusive content, including text, interviews, and videos, has been exploited by OpenAI’s chatbot to generate responses, often verbatim, without consent.

Social Samosa

22 Nov 2024 17:42 IST

The Delhi High Court has summoned OpenAI, the U.S.-based artificial intelligence company, following a petition by Asian News International (ANI), which accuses the tech giant of copyright infringement. ANI alleges that OpenAI used its copyrighted material without permission to train its chatbot, ChatGPT. The multimedia news agency seeks an injunction to stop OpenAI from further utilising its content for training its large language models (LLMs).

A single-judge bench of Justice Amit Bansal took cognisance of the matter on November 19 and issued notices to OpenAI. ANI claims that its exclusive content, including text, interviews, and videos, has been exploited by OpenAI’s chatbot to generate responses, often verbatim, without consent. The news agency is also pursuing damages for what it describes as unjust enrichment, unfair competition, and the misattribution of statements.

OpenAI’s initial response

During the hearing, OpenAI informed the court that it had blocked ANI’s website to prevent the chatbot from accessing its content. However, the court noted the broader implications of the case and appointed an amicus curiae to assist in examining copyright issues raised by evolving technology. “Considering the range of issues in the present suit as well as issues arising on account of the latest technological advancements vis-à-vis copyrights of various copyright owners, this court is of the view that an amicus curiae be appointed to assist the court in this case,” Justice Bansal stated, as reported by LiveLaw.

The case has been scheduled for its next hearing in January 2025.

ANI’s allegations

ANI contends that OpenAI’s ChatGPT platform infringes on its copyright by duplicating its proprietary content, which is used to train the chatbot’s conversational abilities. The agency alleges that such practices violate the Copyright Act, 1957, which grants creators exclusive rights to reproduce, adapt, and distribute their works.

It claims that its archive, built over five decades, includes a range of proprietary content such as exclusive statements, articles, programmes, interviews, videos, and images. According to the petition, it operates under a licensing model that prohibits further syndication of its content and derives revenue from subscription arrangements, advertising, and syndication fees. The news agency argues that OpenAI’s alleged actions undermine its ability to monetise its intellectual property, thereby disrupting its business model.

The petition also outlines a specific instance where ChatGPT falsely stated that Congress leader Rahul Gandhi appeared on an ANI podcast, even providing a fabricated summary of the purported interview. The news agency alleges that such misattributions mislead the public and tarnish its credibility.

The broader copyright debate

The case raises important questions about the use of copyrighted content in training artificial intelligence models. LLMs like ChatGPT are trained on vast amounts of publicly available data, including articles, websites, and books, to generate human-like responses. However, as the Harvard Data Science Review noted in July 2024, these systems are not always accurate and may inadvertently misrepresent or fabricate information.

ANI argues that OpenAI’s storage and use of its content amount to perpetual copyright infringement, as the material remains in the chatbot’s memory indefinitely. Furthermore, the agency contends that the use of its content for paid subscription services exacerbates the violation, given that OpenAI’s premium plans directly monetise the output generated by its models.

Global precedents

This lawsuit is not an isolated incident. In 2023, The New York Times sued OpenAI for unauthorised use of its articles to train ChatGPT. A December 2023 report in The New York Times set out the newspaper’s concern: “When chatbots are asked about current events or other newsworthy topics, they can generate answers that rely on journalism by The Times. The newspaper expresses concern that readers will be satisfied with a response from a chatbot and decline to visit The Times’s website, thus reducing web traffic that can be translated into advertising and subscription revenue.”

Similar concerns have been raised by a group of 17 authors, including George R.R. Martin and John Grisham, who filed a class action lawsuit in September 2023 against OpenAI. The authors accused the company of illegally using their copyrighted works to train its models, as reported by CNN.

Microsoft was also named as a defendant in The New York Times suit, over the use of the newspaper’s articles to train Bing’s AI capabilities. These cases collectively highlight a growing tension between AI companies and content creators over the ethical and legal boundaries of data usage.

Copyright and its legal framework

Under Indian law, copyright is a bundle of rights granted to creators of literary, artistic, and other intellectual works. The Copyright Act, 1957, protects these rights, including reproduction, communication to the public, adaptation, and translation. Infringement occurs when substantial portions of copyrighted material are used without permission, and violators can face legal action for damages and injunctions.

ANI’s petition references its exclusive rights under the Act, highlighting that it licenses its content to subscribers under strict agreements that exclude further syndication. The agency argues that OpenAI’s actions undermine these rights, amounting to unjust enrichment at the expense of ANI’s proprietary material.

The court’s decision in this case could set an important precedent for how AI models interact with copyrighted content, balancing technological innovation with intellectual property rights.