
Congress MP Shashi Tharoor on Tuesday (December 16) asked a question in the Lok Sabha that has been vexing lawyers, creators and technology companies across India: is the government reviewing the Copyright Act, 1957 to address the legal challenges arising from artificial intelligence (AI)?
In response, Minister of State for Commerce and Industry Jitin Prasada confirmed that an eight-member committee, constituted by the Department for Promotion of Industry and Internal Trade (DPIIT), has finalised the first part of a working paper addressing the use of copyrighted content in AI training.
This development comes at a time when the intersection of generative AI (GenAI) and copyright law has sparked legal battles across the world, pitting the economic incentives of human creators against the innovation needs of AI developers.
Core conflict: input vs output
Generative AI models, such as OpenAI’s ChatGPT and Google’s Gemini, are “trained” on vast amounts of data scraped from the internet — books, news articles, images and music. Much of this data is protected by copyright.
The central legal question is whether using this data to train a machine constitutes copyright infringement or if it falls under exceptions like “fair dealing” in India or “fair use” in the US.
The fair dealing exception permits limited use of copyrighted material without the owner’s consent for specific purposes, such as research, criticism, review, reporting or teaching. The objective of this exception is to ensure a balance between protecting creators’ rights and promoting public access to information and ideas.
“There are only two issues when copyright is implicated with the AI business,” Prashant Reddy T, an intellectual property lawyer, told The Indian Express. “One is when companies make copies of copyrighted works to create a dataset and train the AI program. The second is when a GenAI prepares an answer for a reader that contains copyright-infringing material.”
While the output stage — where an AI platform might spit out a replica of a copyrighted work — is easier to adjudicate as copyright infringement, the input stage is legally murkier. AI companies argue that reading data to learn patterns is akin to human learning; creators argue it is industrial-scale theft.
This friction has already reached Indian courts. Last year, the news agency ANI sued OpenAI in the Delhi High Court for using its content to train ChatGPT. (The Digital News Publishers Association, which The Indian Express is a part of, has filed an intervention application in the HC in order to be impleaded and heard in the matter). The matter is pending in court.
“The court’s interpretation of the terms ‘adaptation’ and ‘reproduction’ in the Copyright Act is key in this case,” says Shehnaz Ahmed, who leads research in applied law and technology at the Vidhi Centre for Legal Policy. She notes that unlike the US, where “fair use” is broad, “the fair dealing framework under the Indian Copyright Act is purpose-specific and does not address the use of copyrighted works for commercial AI model training.”
That said, another school of thought holds that copyright law should not necessarily penalise the act of “learning”, whether by a human or a machine.
Nikhil Narendran, a partner at Trilegal, argues that training AI is similar to human learning – even though the scale is different. “Mere learning should not impact anybody’s copyright and that should not be used to prevent a new mode of communication unless the output violates someone’s copyright,” he says.
Narendran suggests the focus should be on market harm caused by the output, rather than the input. “If at all there is a direct market dilution: that is when creators should get paid,” he says, citing examples where an AI tool might generate news summaries that stop users from visiting the original news website.
To resolve this, the DPIIT Committee’s working paper, titled ‘One Nation One License One Payment’, proposes a departure from traditional copyright management.
Rejecting both a blanket exception for AI, which tech companies wanted, and a purely voluntary licensing model, which content owners had demanded, the committee has recommended what it calls a “hybrid model”.
Under this proposed framework, copyright holders will not have the option to withhold their works from AI training; instead, a mandatory blanket licence will be imposed. In exchange, AI developers must pay a “statutory remuneration” — likely a percentage of their global revenue — to a centralised collecting society, the Copyright Royalties Collective for AI Training.
The report states: “The Committee recognized that access to large volumes of data… is crucial for AI development. Long negotiations and high transaction costs can hold back innovation.”
By removing the right to say “no”, the government aims to ensure AI developers have access to data while ensuring creators get paid.
While the model attempts to strike a middle ground, legal experts are sceptical about its implementation. There are concerns about how royalties would be calculated and distributed. The proposal suggests a government-appointed committee will fix rates.
“The model proposed by the DPIIT committee, while well-intentioned, will depend on operational details,” says Ahmed. “As the committee notes, setting royalties is contestable. Most developers themselves don’t know to what extent what work has been used in the ultimate product.”
She adds that a government-backed entity and collective copyright management organisations overseeing the distribution of payments would need “clear statutory parameters, transparent processes and robust grievance-redress mechanisms” to ensure “predictability and limit the scope for litigation”.
The proposal contrasts with global trends. Jurisdictions like Singapore, Japan and the European Union have introduced Text and Data Mining exceptions within copyright law. This means that someone with lawful access to copyrighted works can use them for AI training without seeking specific permission from the copyright owner.
Question of authorship
While the current DPIIT report focuses on training data, the minister’s parliamentary response noted that “Part 2” of the paper will address the copyrightability of AI-generated works. Can an AI be an author?
Currently, the law is human-centric. “The emerging consensus is that if human input is so significant in an original output, then the human has applied their mind and should get copyright over it,” says Narendran.
However, for works generated entirely by prompts with little human intervention, the legal path is unclear. “Copyright law is meant to incentivise and protect the creativity of a human being,” says Reddy. “If a work is generated by some kind of AI, it is not to be protected.”
But the question is largely academic: AI tools cannot apply for copyright, and the companies running them are unlikely to do so, since it might risk driving away users, according to experts.