AI and Copyright Law: U.S. Framework for Training, Copyrightability, and Digital Replicas

This article provides an informational analysis of copyright law as it relates to artificial intelligence as of August 2025. While primarily examining United States law, including recent federal court decisions and U.S. Copyright Office guidance, it also addresses relevant international developments. This analysis does not constitute legal advice. For specific legal questions, consult qualified counsel.
Executive Summary
Many large-scale AI models are trained on datasets that include copyrighted works, raising fundamental legal questions about when such use requires authorization, how creators should be compensated, and what rights exist in AI-generated outputs. This article examines the current legal framework governing these issues, drawing from recent judicial decisions, U.S. Copyright Office guidance, and emerging international regulatory approaches.
The Copyright Office confirms that using copyrighted works to train AI models involves reproduction requiring permission or fair use. Purely AI-generated works cannot receive copyright protection because they lack human authorship.
The Copyright Office released its comprehensive Part 3 report on Generative AI Training in May 2025, providing authoritative guidance on how copyright law applies to AI development. The Office confirms that using copyrighted works to train generative AI models involves multiple acts implicating copyright owners’ exclusive rights, including reproduction during data collection, curation, and training. The central question becomes whether these uses qualify as fair use under existing law. U.S. COPYRIGHT OFFICE, COPYRIGHT AND ARTIFICIAL INTELLIGENCE, PART 3: GENERATIVE AI TRAINING 26-31 (2025) [hereinafter 2025 AI TRAINING REPORT].
The Office has established clear boundaries regarding AI-generated content in its 2025 reports. Purely AI-generated works cannot receive copyright protection because they lack human authorship, a bedrock requirement that the D.C. Circuit affirmed in Thaler v. Perlmutter, 130 F.4th 1039, 1044-45 (D.C. Cir. 2025). Consistent with the U.S. Copyright Office’s January 2025 report, prompts alone do not provide sufficient human control over the expressive elements of the output to establish authorship; protection may extend to human-authored elements that are perceptible, to sufficient modifications, or to creative selection and arrangement. U.S. COPYRIGHT OFFICE, COPYRIGHT AND ARTIFICIAL INTELLIGENCE, PART 2: COPYRIGHTABILITY 7-10, 16-21, 37-38 (2025) [hereinafter 2025 AI COPYRIGHTABILITY REPORT].
Additionally, the Copyright Office’s July 2024 report on Digital Replicas addresses the urgent need for federal legislation to protect individuals from unauthorized digital replicas. The Office recommends that Congress establish a federal right protecting all individuals during their lifetimes from the knowing distribution of unauthorized digital replicas, addressing gaps in existing state and federal laws that have become increasingly apparent with the proliferation of deepfake technology. U.S. COPYRIGHT OFFICE, COPYRIGHT AND ARTIFICIAL INTELLIGENCE, PART 1: DIGITAL REPLICAS 57 (2024) [hereinafter 2024 DIGITAL REPLICAS REPORT].
Recent court decisions have begun to shape the fair use landscape for AI training. In Bartz v. Anthropic PBC, Judge Alsup granted in part and denied in part Anthropic’s motion for summary judgment, holding that the company’s training use was “exceedingly transformative” under the first fair use factor but declining to dismiss claims regarding allegedly “pirated” library copies. Bartz v. Anthropic PBC, No. C 24-05417 WHA, 2025 WL 1741691, at *9, *31 (N.D. Cal. June 23, 2025). Two days later, in Kadrey v. Meta Platforms, Inc., Judge Chhabria granted summary judgment to Meta on fair use grounds because the plaintiffs failed to present meaningful evidence of market harm, providing an extensive discussion of market harm theories. Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC, 2025 WL 1752484, at *1, *34–*36 (N.D. Cal. June 25, 2025). Although ruling for Meta on that basis, the court acknowledged that “although the devil is in the details, in most cases the answer will likely be yes” when asked whether using copyrighted materials to train AI without permission is illegal. Id. at *1. Two days later, the court in Kadrey issued a separate order on the plaintiffs’ DMCA claim under 17 U.S.C. § 1202(b), holding that the removal of copyright management information cannot “induce, enable, facilitate, or conceal” infringement when the underlying use constitutes fair use. Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC, 2025 WL 1786418, at *1–*2 (N.D. Cal. June 27, 2025). Taken together, these decisions demonstrate that fair use determinations in the AI training context remain highly fact-specific, with technical safeguards, the provenance of training data, and concrete evidence of market harm playing crucial roles in the analysis.
The licensing landscape for AI training is evolving rapidly but unevenly across different sectors and types of content. Documented deals demonstrate the emergence of sizable voluntary markets, including News Corp-OpenAI valued at approximately $250 million over five years, Shutterstock’s $104 million in AI licensing revenue in 2023, Taylor & Francis-Microsoft involving $10 million upfront plus recurring fees, and Wiley’s March 7, 2024 disclosure of $23 million in AI licensing revenue. Major publishers, news organizations, and stock photography companies have established licensing programs generating hundreds of millions in revenue, demonstrating that voluntary market solutions are emerging for certain categories of high-value, easily identifiable content. However, significant challenges remain for other types of works, particularly those created outside professional creative industries or where ownership is diffuse. The Copyright Office recommends allowing these voluntary markets to continue developing without government intervention at this time, though targeted solutions such as extended collective licensing may merit consideration if specific market failures become apparent. 2025 AI TRAINING REPORT, supra, at 103-06.
International approaches vary significantly, creating compliance challenges for companies operating across borders. The AI Act entered into force on August 1, 2024, with General-Purpose AI obligations under Article 53 applying from August 2, 2025, including requirements for a copyright-compliance policy that honors DSM Article 4(3) opt-outs and publication of a sufficiently detailed summary of training content. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence, art. 53, 2024 O.J. (L 1689) 1. Article 30-4 of Japan’s Copyright Act permits data analysis, but uses aimed at enabling enjoyment of the expression—including fine-tuning intended to output a work and certain RAG configurations—are excluded. Agency for Cultural Affairs, General Understanding on AI and Copyright in Japan 12-15 (May 2024). As of August 2025, the UK has consulted on an EU-style TDM exception with opt-out but has not legislated; the Data (Use and Access) Act 2025 addresses data access and processing, not copyright exceptions for AI training. These divergent approaches raise questions about international harmonization and treaty compliance that will require ongoing attention as the technology and its regulation continue to evolve.
Constitutional Foundations of Copyright in the AI Era
The United States Constitution provides the fundamental framework for understanding copyright’s role in the age of artificial intelligence. Article I, Section 8, Clause 8 grants Congress the power “To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” U.S. CONST. art. I, § 8, cl. 8. This constitutional provision, known as the Intellectual Property Clause or Copyright Clause, establishes both the purpose and limitations of copyright protection.
The Supreme Court has consistently interpreted this clause as creating a utilitarian bargain rather than a natural right. In Feist Publications, Inc. v. Rural Telephone Service Co., the Court explained that the “primary objective of copyright is not to reward the labor of authors, but to promote the Progress of Science and useful Arts.” 499 U.S. 340, 349 (1991). This constitutional purpose takes on new significance in the AI context, where courts must balance incentivizing human creativity against enabling technological innovation.
The constitutional text imposes critical limitations that shape AI-related copyright issues. First, the reference to “Authors” has been interpreted to require human authorship. The D.C. Circuit’s 2025 decision in Thaler v. Perlmutter, 130 F.4th 1039, 1044-45 (D.C. Cir. 2025), relied on this constitutional foundation in affirming that AI systems cannot be authors under copyright law. The court emphasized that the Copyright Act’s human authorship requirement flows from the Constitution’s use of “Authors,” a term that has consistently been understood to refer to human beings.
Second, the “limited Times” provision ensures that works eventually enter the public domain, creating a reservoir of material that can be freely used for training AI systems. This temporal limitation reflects the Framers’ understanding that copyright should balance private incentives with public access to knowledge and culture. As the Supreme Court noted in Eldred v. Ashcroft, copyright’s limited duration serves the constitutional purpose by ensuring that creative works ultimately become “part of the public domain, free for all to use.” 537 U.S. 186, 219 (2003).
The originality requirement, though not explicit in the constitutional text, has been deemed constitutionally mandated. The Supreme Court in Feist held that originality is “constitutionally mandated for all works” and requires both independent creation and a minimal degree of creativity. 499 U.S. at 351. This constitutional floor for copyright protection becomes particularly relevant when evaluating AI-generated content, which may appear creative but lacks the human origin that the Constitution requires.
The constitutional framework also informs the fair use doctrine’s application to AI training. Judge Chhabria’s opinion in Kadrey v. Meta explicitly invoked constitutional purposes, noting that copyright law’s primary concern is “preserving the incentive for human beings to create artistic and scientific works.” This constitutional perspective suggests that uses of copyrighted works that undermine these incentives—such as training AI to flood the market with competing works—may be particularly disfavored under fair use analysis.
The Copyright Office’s 2025 reports repeatedly return to these constitutional foundations. The Office emphasizes that extending copyright protection to AI-generated works would not serve the Constitution’s purpose of incentivizing human creativity, as “machines do not need incentives to create.” 2025 AI COPYRIGHTABILITY REPORT, supra, at 35. Similarly, the Office’s analysis of fair use for AI training considers whether such uses promote or hinder “the Progress of Science and useful Arts,” recognizing that this constitutional purpose must guide the interpretation of statutory provisions.
These constitutional principles create a framework that both enables and constrains AI development. While the Constitution’s promotion of progress supports technological innovation, including AI advancement, it does so through a system designed to incentivize human creativity. Courts and policymakers must navigate this tension, ensuring that AI development does not undermine the constitutional bargain that has successfully promoted American creative and scientific leadership for over two centuries.
Understanding the Technology
How AI Training Works
The Copyright Office’s May 2025 report provides authoritative technical background on how generative AI systems are developed. Machine learning, a field of artificial intelligence, focuses on designing computer systems that can automatically learn and improve based on data or experience, without relying on explicitly programmed rules. The basic technique involves creating a statistical model using training examples along with a metric of how well the model performs. 2025 AI TRAINING REPORT, supra, at 4-5.
Generative AI systems learn by analyzing patterns in training data. To build a language model, developers feed the system billions of text examples. The system learns to predict what words typically follow other words. After training on enough examples, it can generate new text that follows similar patterns. This process requires three main steps. In data collection, developers gather training materials from various sources—scraped websites, downloaded databases, licensed content from publishers, or pirated collections. A single model might train on millions of books, billions of web pages, and countless images. During processing and training, developers clean and organize this data, then use it to train neural networks. The training process involves showing examples to the model thousands of times while adjusting its parameters to improve predictions. This creates a statistical model encoded in billions of numerical weights. In deployment, companies deploy trained models in products that serve various purposes. ChatGPT answers questions and writes text. Midjourney creates images from text descriptions. Some systems retrieve additional copyrighted content during operation to enhance their responses.
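The statistical principle described above—predicting the next word from patterns observed in training examples—can be illustrated with a deliberately simplified sketch. The following toy bigram model is a hypothetical illustration only; real language models use neural networks with billions of parameters, but the underlying idea of deriving a predictive statistical model from training text is the same.

```python
from collections import Counter, defaultdict

# Toy illustration: a bigram "language model" that learns which word
# tends to follow another by counting co-occurrences in training text.
# This is not how production systems work internally, but it shows the
# core statistical idea: the model's "knowledge" is derived entirely
# from patterns in the training data.

def train_bigram_model(corpus):
    """Count, for each word, how often each successor word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the successor seen most often in training, if any."""
    successors = model.get(word.lower())
    return successors.most_common(1)[0][0] if successors else None

corpus = [
    "the model learns patterns from data",
    "the model predicts the next word",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "model" follows "the" most often here
```

Even this toy makes the legal point concrete: nothing in the trained `counts` structure is a stored copy of a sentence, yet every number in it is derived from the training text.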
Generative AI specifically relies on neural networks—mathematical functions that map input data to output data through large collections of numbers called parameters, which define the mapping of inputs to outputs. With billions of parameters, collectively referred to as the network’s “weights,” modern neural networks are capable of computing highly complex transformations, such as converting text to video. The Office emphasizes that while code defines the basic structure of a neural network, “it is the weights that reflect patterns learned from the training data, and which are most likely to be treated as proprietary by developers or draw the scrutiny of copyright owners.” 2025 AI TRAINING REPORT, supra, at 6.
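The Office's distinction between a network's code (its structure) and its weights (the learned values) can be sketched in a few lines. This single artificial "neuron" is a hypothetical toy, not any real architecture; it shows that the same code computes entirely different functions depending on the weights, which is why the weights, not the code, draw copyright scrutiny.

```python
# Minimal sketch: a neural network is a parameterized function. The
# code below fixes the *structure*; the weights and bias (which real
# systems learn from training data) determine the actual mapping from
# inputs to outputs.

def neuron(inputs, weights, bias):
    # Weighted sum of inputs, passed through a ReLU nonlinearity.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, total)

# Identical structure, different weights, different behavior.
print(neuron([1.0, 2.0], [0.5, -0.2], 0.1))   # approximately 0.2
print(neuron([1.0, 2.0], [-1.0, 1.0], 0.0))   # 1.0
```

Scaling this idea to billions of such parameters yields the "highly complex transformations" the report describes.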
The Nature of Prompts and AI Systems
The Copyright Office’s 2025 copyrightability report provides clarity on how AI systems process prompts. A prompt is an input, often in text form, that communicates desired features of an output. 2025 AI COPYRIGHTABILITY REPORT, supra, at 5. The practice of crafting prompts optimized to elicit desired results is sometimes called “prompt engineering.” Id. at 5 n.22.
However, as the Office explains, current AI systems exhibit fundamental unpredictability. Outputs may vary from request to request, even with identical prompts. Id. at 7. Many describe AI as a “black box,” where even expert researchers cannot fully understand or predict specific model behavior. Id. at 6. Some systems now automatically optimize prompts internally, further reducing user control. Id.
Memorization and Reproduction
The technical question of whether models “memorize” training data has significant legal implications. The Copyright Office’s May 2025 report addresses this critical dispute about what happens to copyrighted works during training. While some AI companies assert that “there is no copy of the training data—whether text, images, or other formats—present in the model itself,” others point to numerous examples of models generating “verbatim, near identical, or substantially similar outputs.” 2025 AI TRAINING REPORT, supra, at 19.
The Office cites research by A. Feder Cooper and James Grimmelmann explaining that “the problem is that the ‘patterns’ learned by a model can be highly abstract, highly specific, or anywhere in between,” and where the learned pattern is highly specific, “the pattern is the memorized training data.” Id. at 20. The Office notes that considerable research has documented the extent of memorization, with factors influencing it including the number of model parameters, presence of duplicates in training data, whether an example is unusual or an outlier, and how broadly memorization is defined. Id. at 21.
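One simple way the memorization research cited above operationalizes its question is to measure verbatim overlap between model outputs and training text. The checker below is a hedged, toy sketch (the function name and example strings are invented for illustration); actual studies use far more sophisticated extraction and matching techniques.

```python
# Toy memorization check: report the longest run of consecutive words
# that a generated output shares verbatim with a training document.
# Long shared runs suggest the "pattern" the model learned is, in
# Cooper and Grimmelmann's terms, highly specific -- i.e., memorized.

def longest_shared_run(output, document):
    out_words = output.split()
    doc_words = document.split()
    best = 0
    for i in range(len(out_words)):
        for j in range(len(doc_words)):
            k = 0
            while (i + k < len(out_words) and j + k < len(doc_words)
                   and out_words[i + k] == doc_words[j + k]):
                k += 1
            best = max(best, k)
    return best

training_text = "it was the best of times it was the worst of times"
generated = "the model said it was the best of times indeed"
print(longest_shared_run(generated, training_text))  # 6-word verbatim span
```

A threshold on this run length is one crude proxy for the legally salient line between abstract patterns and reproduced expression.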
Recent litigation has exposed this dispute. In Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC, 2023 WL 8039640, at *1 (N.D. Cal. Nov. 20, 2023), the court dismissed claims that Meta’s Llama models were themselves infringing derivative works, calling such allegations “nonsensical.” But the court’s reasoning turned on the plaintiffs’ failure to allege that the models could generate copies of their works. The court explicitly distinguished cases where models can “spit out actual copies of their protected works.” Id. at *3.
This distinction proved decisive in Andersen v. Stability AI, where the court allowed certain copyright claims to proceed and accepted as plausible plaintiffs’ allegation that the Stable Diffusion model contains compressed representations that can enable reproduction of training images; it did not hold that model weights are per se infringing copies. Andersen v. Stability AI, No. 3:23-cv-00201-WHO, ECF 223 (N.D. Cal. Aug. 12, 2024).
Whether distributing trained weights constitutes distribution of a “copy” is unresolved; courts have not yet decided this question. The underlying memorization risk, however, is not theoretical. Researchers have extracted verbatim text from GPT-2, near-identical images from Stable Diffusion, and recognizable code from GitHub Copilot. As the research paper “Extracting Training Data from Diffusion Models” documented, some images generated by Stable Diffusion were near-identical copies of training images. Nicholas Carlini et al., Extracting Training Data from Diffusion Models, ARXIV (Jan. 30, 2023), https://arxiv.org/abs/2301.13188.
MAI Systems Corp. v. Peak Computer, Inc., 991 F.2d 511, 518 (9th Cir. 1993), held that loading software into RAM creates a copy for purposes of copyright law. However, that case addressed software code, not machine learning weights, and did not establish a broader “more-than-transitory” threshold for all digital copies. The application of MAI to AI model weights remains an open question that courts have not addressed.
Copyright Law Analysis
When Copyright Applies
Copyright automatically protects original creative works—virtually all text, images, music, and video online. No registration or copyright symbol is required. See 17 U.S.C. § 102(a) (2018) (defining copyrightable subject matter).
The Copyright Office’s May 2025 report confirms that using these works for AI training involves multiple acts of copying. Developers copy when they download and store training data, process and reformat files, load data during training, save trained model weights, and retrieve content for enhanced responses. Each act potentially infringes copyright unless covered by fair use or a license. 2025 AI TRAINING REPORT, supra, at 26-31; see also 17 U.S.C. § 106 (2018) (enumerating exclusive rights).
The Fair Use Framework
Fair use permits certain uses of copyrighted works without permission. Section 107 of the Copyright Act requires courts to weigh four factors to determine whether a particular use qualifies. 17 U.S.C. § 107 (2018). The Copyright Office’s May 2025 report provides comprehensive analysis of how these factors apply to AI training, examining each in detail while acknowledging that determinations must be made case-by-case. 2025 AI TRAINING REPORT, supra, at 32-74.
Fair use determinations are highly fact-specific. The June 2025 court decisions show that technical safeguards, data sources, and output capabilities play crucial roles. Using pirated materials may be “inherently, irredeemably infringing,” according to Judge Alsup.
The June 2025 summary judgment decisions in Bartz v. Anthropic and Kadrey v. Meta provide the first substantive judicial analysis of these factors in the AI training context. Judge Alsup granted partial summary judgment to Anthropic on fair use grounds but denied the motion regarding pirated data claims, while Judge Chhabria granted summary judgment to Meta on training data issues, though based on failure of proof rather than a complete endorsement of the fair use defense.
Purpose and Character
The Supreme Court’s decision in Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith fundamentally reshaped this analysis. 598 U.S. 508, 529 (2023). The Court rejected the notion that adding new expression automatically makes a use transformative. Instead, Warhol focuses on the specific secondary use’s purpose. When Warhol’s estate licensed his Prince silkscreen for magazine publication—the same purpose as Goldsmith’s original photograph—the Court found the use non-transformative despite Warhol’s artistic modifications. Id. at 531-32.
Both Judge Alsup in Bartz and Judge Chhabria in Kadrey agreed that making copies of in-copyright works to train generative AI models generally has a transformative purpose. Book authors wrote their works to educate or entertain readers, whereas Anthropic and Meta had a different purpose—to statistically analyze the books’ contents as training data for foundation models. Both judges found these training purposes highly transformative, though they cautioned that transformativeness alone does not guarantee fair use.
The Copyright Office concludes that “training a generative AI foundation model on a large and diverse dataset will often be transformative,” as the process “converts a massive collection of training examples into a statistical model that can generate a wide range of outputs across a diverse array of new situations.” 2025 AI TRAINING REPORT, supra, at 45. However, the Office emphasizes that transformativeness is a matter of degree, with uses falling along a spectrum. Training for research or deployment in closed systems for non-substitutive tasks represents the most transformative end, while training to generate outputs substantially similar to copyrighted works in datasets represents the least transformative use. Id. at 46.
The Office specifically rejects arguments that AI training is inherently transformative because it serves “non-expressive” purposes, noting that “language models are trained on examples that are hundreds of thousands of tokens in length, absorbing not just the meaning and parts of speech of words, but how they are selected and arranged at the sentence, paragraph, and document level—the essence of linguistic expression.” Id. at 47. The Office also dismisses the analogy between AI and human learning, emphasizing that “AI learning is different from human learning in ways that are material to the copyright analysis,” as generative AI training “involves the creation of perfect copies with the ability to analyze works nearly instantaneously” and can “create at superhuman speed and scale.” Id. at 48.
A significant judicial split has emerged regarding the use of pirated materials. Judge Alsup was highly critical of Anthropic’s use of pirated books to train its models, stating that “piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use and immediately discarded.” By contrast, Judge Chhabria found that Meta’s use of pirated books didn’t “move the needle” on its fair use claim, viewing the issue as neither dispositive nor irrelevant. He noted that the Supreme Court has twice suggested that what matters is whether the challenged use is objectively fair, not whether the putative fair user was a good or bad faith actor.
The Warhol precedent directly challenges AI companies’ arguments that statistical analysis of copyrighted works is inherently transformative. While Authors Guild v. Google, Inc., 804 F.3d 202, 216-17 (2d Cir. 2015), found that digitizing books to create a searchable database served a transformative purpose, that case involved providing information about books rather than generating new creative content. The distinction matters: Google Books helped users find books to purchase, while generative AI creates content that competes with its training materials.
Nature of the Work
The Second Circuit in Authors Guild gave this factor little weight, noting that Google had copied both factual and creative works. 804 F.3d at 220. However, as the Copyright Office notes, AI training presents different considerations. Language models specifically seek creative, well-written content—novels, poetry, and professional journalism—precisely because of their expressive qualities. This deliberate selection of creative works should weigh more heavily against fair use than the incidental inclusion of creative works in Google’s comprehensive book-scanning project. 2025 AI TRAINING REPORT, supra, at 54.
Both judges in the June 2025 decisions agreed that the nature factor cut against fair use because Anthropic and Meta had chosen to use plaintiffs’ books specifically because of their expressiveness. The highly creative nature of the works placed them closer to the “core” of copyright protection.
Amount Used
The Ninth Circuit’s approach in Sony Computer Entertainment, Inc. v. Connectix Corp., 203 F.3d 596, 605 (9th Cir. 2000), provides the most relevant framework. That court held that copying an entire work carries “very little weight” when the final product doesn’t contain the copyrighted expression. But Connectix involved reverse engineering to access unprotectable functional elements. AI training seeks to absorb expressive elements—the very style, structure, and linguistic patterns that copyright protects.
Both judges in the June 2025 decisions found that Anthropic and Meta acted reasonably in copying the entirety of plaintiffs’ works for training data purposes, given their transformative purpose. The Copyright Office acknowledges that “AI developers ordinarily copy entire works and make use of their expressive content for training,” which generally weighs against fair use. 2025 AI TRAINING REPORT, supra, at 55. However, the Office notes that “the use of entire works appears to be practically necessary for some forms of training for many generative AI models,” particularly for “internet-scale pre-training data, including large amounts of entire works” needed to achieve current-generation model performance. Id. at 57.
Critically, the Office and recent court decisions emphasize the importance of technical safeguards. Judge Chhabria specifically noted that Meta’s implementation of output filters preventing regurgitation of substantial expression from training data was significant in finding no market harm through lost sales. Where developers adopt “adequate safeguards to limit the exposure of copyrighted material,” including “input filters that block user prompts likely to result in generations that reproduce copyrighted content,” “training techniques designed to make infringing outputs less likely,” and “output filters that block copyrighted content from being displayed,” the third factor may weigh less heavily against fair use. Id. at 59-60.
Market Effect
The Supreme Court in Campbell v. Acuff-Rose Music, Inc. directed courts to consider not only “the extent of market harm caused by the particular actions of the alleged infringer,” but also “whether unrestricted and widespread conduct of the sort engaged in by the defendant . . . would result in a substantially adverse impact on the potential market” for the original. 510 U.S. 569, 590 (1994). The June 2025 Kadrey decision revealed significant judicial analysis of how to assess market harm in the AI context.
Judge Chhabria identified three potential theories of market harm from AI training. First, direct substitution occurs when models output verbatim or substantially similar copies of protected works. Second, lost licensing fees result when developers bypass available licensing markets. Third, market dilution occurs when AI systems flood markets with competing works that, while not infringing, reduce demand for human-created works.
Neither judge in the 2025 decisions found the lost license fee theory persuasive. Both judges stated that training data uses of books represent a market that authors are not entitled to control for transformative purpose uses, citing precedent that harm from lost fees for transformative uses is not cognizable under the fourth factor. See Campbell, 510 U.S. at 591-92. Judge Chhabria characterized the Kadrey plaintiffs’ argument about harm from enabling output infringements as a “clear loser,” noting that Meta had developed effective output filters preventing regurgitation of more than small chunks of expression from books ingested as training data.
However, Judge Chhabria provided extensive analysis of the market dilution theory, which he found far more compelling than the other theories of harm. He explained that AI systems capable of generating “endless amounts of biographies,” “magazine articles,” or genre fiction could “severely harm” markets for human-authored works. The judge noted that “the market for the typical human-created romance or spy novel could be diminished substantially by the proliferation of similar AI-created works,” which would “presumably diminish the incentive for human beings to write romance or spy novels in the first place.”
Judge Chhabria explicitly rejected arguments that market dilution does not count under the fourth factor, stating that “indirect substitution is still substitution: If someone bought a romance novel written by an LLM instead of a romance novel written by a human author, the LLM-generated novel is substituting for the human-written one.” He distinguished this from non-cognizable harm caused by criticism or commentary, which can harm demand without serving as a replacement. Cf. Campbell, 510 U.S. at 591-92 (criticism that “kills demand for the original” does not produce cognizable harm under the Copyright Act).
The judge emphasized that generative AI presents unprecedented challenges for copyright law, observing that “no other use—whether it’s the creation of a single secondary work or the creation of other digital tools—has anything near the potential to flood the market with competing works the way that LLM training does.” He suggested that “it seems likely that market dilution will often cause plaintiffs to decisively win the fourth factor—and thus win the fair use question overall—in cases like this.”
Judge Chhabria’s “market dilution” theory represents a novel approach: AI systems flooding markets with competing works could “severely harm” human creators even without direct copying. He suggests this factor alone could often defeat fair use claims.
Despite finding the market dilution theory compelling, Judge Chhabria granted summary judgment to Meta because the plaintiffs failed to present any evidence supporting this theory. He noted that the plaintiffs “never so much as mentioned” market dilution in their complaint, provided no analysis of markets for their books, offered no discussion of whether AI-generated books compete in these markets, and presented no evidence about actual or likely future impact on sales. The court concluded that “speculation is insufficient to raise a genuine issue of fact and defeat summary judgment.” Kadrey v. Meta Platforms, Inc., 2025 WL 1752484, at *23-24 (N.D. Cal. June 25, 2025).
Judge Alsup, by contrast, dismissed the market dilution theory as “science fiction” during oral argument in the Bartz case, comparing AI training to “training schoolchildren to write well,” which could also “result in an explosion of competing works.” Bartz v. Anthropic PBC, No. C 24-05417 WHA (N.D. Cal. Mar. 15, 2025) (oral argument transcript). This stark disagreement between the two judges highlights a fundamental question about copyright’s role in protecting human creators from non-infringing competition that will likely require appellate resolution.
The Copyright Office identifies multiple forms of potential market harm. First, lost sales can occur when models “output verbatim or substantially similar copies of the works trained on” that are “readily accessible by end users.” 2025 AI TRAINING REPORT, supra, at 63. Second, market dilution threatens creators as “the speed and scale at which AI systems generate content pose a serious risk of diluting markets for works of the same kind as in their training data,” meaning “more competition for sales of an author’s works and more difficulty for audiences in finding them.” Id. at 65. Third, lost licensing opportunities represent significant harm where “voluntary licensing is already happening in some sectors” and appears “reasonable or likely to be developed in others.” Id. at 67, 70.
Perhaps most significantly, the rapidly developing licensing market undermines fair use claims under the fourth factor. In American Geophysical Union v. Texaco Inc., the Second Circuit rejected Texaco’s fair use defense for photocopying scientific articles partly because a licensing mechanism existed through the Copyright Clearance Center. 60 F.3d 913, 929-31 (2d Cir. 1995). The court found that bypassing available licenses weighed against fair use even though the individual researchers might have had transformative purposes.
Today’s AI licensing market dwarfs what existed in Texaco. Major publishers, image libraries, and news organizations have established licensing programs specifically for AI training. OpenAI, Google, and other companies have signed deals worth hundreds of millions of dollars. See 2025 AI TRAINING REPORT, supra, at 103-06 (documenting emergence of AI licensing markets). Under Texaco’s logic, the existence of these markets creates a presumption against fair use for unlicensed training, particularly for commercial developers who can afford licenses. Cf. 60 F.3d at 930-31.
Applying Fair Use to AI Training
The Copyright Office concludes that “the fair use determination requires balancing multiple statutory factors in light of all relevant circumstances” and that “the first and fourth factors can be expected to assume considerable weight in the analysis.” 2025 AI TRAINING REPORT, supra, at 74. The Office expects that “some uses of copyrighted works for generative AI training will qualify as fair use, and some will not.” Id.
The June 2025 summary judgment decisions confirm this nuanced approach. Judge Alsup’s ruling in Bartz takes a more permissive view of AI training uses, at least where the training copies were not obtained from pirated sources, than does Judge Chhabria’s decision in Kadrey. The decisions establish that courts will need to distinguish between fundamentally different uses of copyrighted works in AI development, with technical safeguards, data sources, and output capabilities playing crucial roles in the analysis.
Research and Analysis Uses
Training that most closely resembles the transformative use approved in Authors Guild, Inc. v. HathiTrust involves models deployed solely for non-expressive purposes. When universities digitized millions of books to enable full-text searching for research, the Second Circuit found this “quintessentially transformative” because it provided information about works without supplying market substitutes. 755 F.3d 87, 97 (2d Cir. 2014). Similarly, training AI models for scientific analysis, accessibility tools, or content moderation serves different purposes than the expressive works used in training, and these non-substitutive uses strengthen the fair use claim.
The Copyright Office notes that “on one end of the spectrum, uses for purposes of noncommercial research or analysis that do not enable portions of the works to be reproduced in the outputs are likely to be fair.” 2025 AI TRAINING REPORT, supra, at 74. The Office emphasizes that training is “most transformative when the purpose is to deploy it for research, or in a closed system that constrains it to a non-substitutive task.” Id. at 46.
The key distinction from HathiTrust lies in output capabilities. The HathiTrust database could only return search results and page numbers—it couldn’t generate new novels. When AI models can produce creative content, they move beyond HathiTrust’s protective scope into more dangerous territory under Warhol.
Commercial Content Generation
At the opposite extreme, using copyrighted works to train models that generate competing content faces serious obstacles under current precedent. The Second Circuit’s decision in Fox News Network, LLC v. TVEyes, Inc. is instructive. 883 F.3d 169, 177-80 (2d Cir. 2018). Although TVEyes served a transformative research purpose by helping users monitor television coverage, the court rejected fair use because the service provided Fox’s content in a format that could substitute for Fox’s own offerings.
The Copyright Office concludes that “on the other end of the spectrum, the copying of expressive works from pirate sources in order to generate unrestricted content that competes in the marketplace, when licensing is reasonably available, is unlikely to qualify as fair use.” 2025 AI TRAINING REPORT, supra, at 74. Judge Alsup’s strong condemnation of using pirated materials in Bartz reinforces this position, finding that Anthropic should have purchased print copies of the plaintiffs’ books to build its library and ruling against Anthropic on all four fair use factors for downloading and storing pirated copies.
Judge Chhabria’s market dilution theory, if adopted by appellate courts, could significantly expand the scope of market harm analysis. He also suggested that the generative AI industry would find a way to pay copyright owners for training uses, stating that “it seems especially likely that these licensing markets will arise if LLM developers’ only choices are to get licenses or forgo the use of copyrighted books as training data,” signaling potential judicial receptiveness to monetary remedies rather than injunctive relief. Kadrey v. Meta Platforms, Inc., 2025 WL 1752484, at *38 (N.D. Cal. June 25, 2025).
The Licensing Market Reality
The rapidly developing licensing market undermines fair use claims under the fourth factor. The Copyright Office documents that “voluntary licensing is already happening in some sectors, and it appears reasonable or likely to be developed in others—at least for certain types of works, training, and models.” 2025 AI TRAINING REPORT, supra, at 73. However, both judges in the June 2025 decisions rejected the argument that the existence of some licensing deals creates an obligation to license all training data, noting the impracticality of millions of individual license transactions given the scale of data needed for sophisticated models.
The DMCA and Fair Use Intersection
The June 2025 DMCA ruling in Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC, 2025 WL 1786418 (N.D. Cal. June 27, 2025), provides crucial clarification about the relationship between fair use and copyright management information (CMI) removal. The court granted Meta’s motion for partial summary judgment on the plaintiffs’ DMCA claim under 17 U.S.C. § 1202(b)(1), holding that because Meta’s copying constituted fair use as a matter of law, its removal of CMI could not violate the DMCA.
The court’s reasoning rests on fundamental statutory interpretation. Section 1202(b)(1) prohibits the intentional removal of CMI by one who knows or has reason to know that the removal will “induce, enable, facilitate, or conceal an infringement.” Since the Copyright Act provides that fair use “is not an infringement of copyright” under Section 107, and Meta’s copying was deemed fair use, there could be no underlying infringement for the CMI removal to further. Id.
Judge Chhabria emphasized two additional policy reasons supporting this interpretation. First, it would be incongruous for Congress to exempt secondary users who make fair use from infringement liability, only to subject them to DMCA liability for removing boilerplate text in the process. Second, Section 1202 can give rise to criminal liability under Section 1204(a) where CMI removal is willful and done for commercial purposes. The court found it “inconceivable that criminal liability would attach to an act that was done in furtherance of a noninfringing fair use.” Id. at *2.
The court rejected the reasoning in Murphy v. Millennium Radio Group, 2015 WL 419884 (D.N.J. Jan. 30, 2015), which had held that DMCA claims could proceed even where the underlying use was fair. Judge Chhabria found Murphy’s three justifications unpersuasive, particularly noting that in the Ninth Circuit, one cannot intend to abet an infringement where they intend to abet something they believe is—and that actually is—fair use. See Evergreen Safety Council v. RSA Network Inc., 697 F.3d 1221, 1228 (9th Cir. 2012).
This ruling has significant practical implications for AI developers. It suggests that establishing fair use not only protects against direct infringement claims but also shields against DMCA claims related to the removal of copyright notices, watermarks, and other metadata during the training process. However, developers should note that this protection only extends to uses that qualify as fair use—unauthorized uses that don’t meet the fair use standard could still face both infringement and DMCA liability.
Copyright Protection for AI-Generated Content
The Human Authorship Requirement
The Copyright Office has definitively established that copyright protection extends only to works created by human authors. This fundamental principle, rooted in both constitutional and statutory interpretation, profoundly affects the commercial value and legal status of AI-generated content.
The Supreme Court established this principle in Burrow-Giles Lithographic Co. v. Sarony, defining an “author” as “he to whom anything owes its origin; originator; maker; one who completes a work of science or literature.” The Court repeatedly characterized authors as human, describing copyright as “the exclusive right of a man to the production of his own genius or intellect.” 111 U.S. 53, 57-58 (1884).
Federal appellate courts have reinforced this interpretation. The Ninth Circuit held in Naruto v. Slater that a monkey lacked statutory standing under the Copyright Act because the Act’s references to an author’s “children,” “widow,” and “widower” necessarily imply human authorship. 888 F.3d 418, 426 (9th Cir. 2018). Similarly, in Urantia Foundation v. Kristen Maaherra, the court held that words “authored by non-human spiritual beings” could only qualify for copyright if there was “human selection and arrangement of the revelations.” 114 F.3d 955, 957-59 (9th Cir. 1997). The Seventh Circuit has squarely held that authors “of copyrightable works must be human.” Kelley v. Chicago Park Dist., 635 F.3d 290, 304 (7th Cir. 2011).
In 2023, the District Court for the District of Columbia became the first court to specifically address AI-generated outputs. In Thaler v. Perlmutter, the court found that “human authorship is a bedrock requirement of copyright” and that copyright has never protected “works generated by new forms of technology operating absent any guiding human hand.” 687 F. Supp. 3d 140, 146 (D.D.C. 2023). The D.C. Circuit affirmed this decision on March 18, 2025, reiterating the human-authorship requirement. Thaler v. Perlmutter, 130 F.4th 1039, 1044-45 (D.C. Cir. 2025) (holding that “the Copyright Act requires all work to be authored in the first instance by a human being”).
Copyright Office 2025 Guidance on Prompts
In January 2025, the Copyright Office issued comprehensive guidance definitively establishing that prompts alone cannot confer authorship of AI-generated outputs. 2025 AI COPYRIGHTABILITY REPORT, supra, at 18-22. The Office concluded that “given current generally available technology, prompts alone do not provide sufficient human control to make users of an AI system the authors of the output.” Id. at 18.
The Office explained that prompts “essentially function as instructions that convey unprotectible ideas” and that while “highly detailed prompts could contain the user’s desired expressive elements, at present they do not control how the AI system processes them in generating the output.” Id. at 18-19. The gaps between prompts and outputs demonstrate that “the user lacks control over the conversion of their ideas into fixed expression, and the system is largely responsible for determining the expressive elements in the output.” Id. at 19.
The Office specifically rejected the theory of “authorship by adoption,” finding that selecting an AI-generated output among uncontrolled options “is more analogous to curating a ‘living garden,’ than applying splattered paint.” Id. at 21 (citing Kelley v. Chicago Park Dist., 635 F.3d 290, 304 (7th Cir. 2011)). Repeatedly revising prompts does not change this analysis, as it amounts to “‘re-rolling’ the dice, causing the system to generate more outputs from which to select, but not altering the degree of control over the process.” Id. at 20.
The Office acknowledged that technological advances might someday provide users with sufficient control over expressive elements through prompts, but concluded that current technology does not meet this threshold. Id. at 21-22.
The Copyright Office definitively states that prompts alone cannot confer authorship of AI-generated outputs, regardless of complexity or iteration. Current technology does not provide sufficient human control over expressive elements through prompts.
Limited Protection for Human Contributions
The 2025 guidance identifies three scenarios where human contributions to AI-generated outputs may receive copyright protection. First, where human authors input their own copyrightable works and those works remain perceptible in the output, they retain authorship of those portions. Second, humans who modify AI-generated material substantially enough that the modifications meet copyright’s originality standard may claim authorship in those modifications. Third, humans may claim copyright in the creative selection, coordination, and arrangement of AI-generated material, though protection extends only to the compilation, not the underlying AI-generated elements. 2025 AI COPYRIGHTABILITY REPORT, supra, at 22-25.
Assistive Uses of AI Systems
The Office drew a crucial distinction between using AI as a tool to assist creation versus using AI as a stand-in for human creativity. Id. at 11-12. When AI enhances human expression through assistive uses, copyright protection remains available. Id. This includes uses such as aging or de-aging actors, identifying chord progressions, detecting software errors, and removing unwanted objects from scenes. Id. at 11 n.63.
The Office specifically addressed creators with disabilities, emphasizing that AI functionalities used as tools “to recast, transform, or adapt an author’s expression” support copyright protection for the resulting work. Id. at 38. The Office cited the example of country artist Randy Travis, who used AI to recreate his voice after a stroke left him with limited speech function. Because the AI functioned as a tool rather than generating expression, the Office registered the work. Id.
Commercial Implications
The lack of copyright protection for AI-generated content creates significant commercial uncertainties. Companies cannot claim exclusive rights to content generated purely by AI systems. Competitors may freely copy and use AI-generated materials unless sufficient human creative control establishes limited protection. This fundamentally affects business models built on AI content generation, potentially reducing the value of AI-generated assets and complicating content licensing arrangements.
The Office’s 2025 report addresses policy arguments for extending protection to AI-generated content and concludes that “the case has not been made for additional protection for AI-generated material beyond that provided by existing law.” Id. at 36. The Office expressed particular concern that “if a flood of easily and rapidly AI-generated content drowns out human-authored works in the marketplace, additional legal protection would undermine rather than advance the goals of the copyright system.” Id. at 36-37.
Protection Against Unauthorized Digital Replicas
The Emergence of Digital Replica Concerns
The Copyright Office’s July 2024 report addresses the urgent need for protection against unauthorized digital replicas. From AI-generated musical performances to robocall impersonations of political candidates to images in pornographic videos, an era of sophisticated digital replicas has arrived. Although technologies have long been available to produce fake images or recordings, generative AI technology’s ability to do so easily, quickly, and with uncanny verisimilitude has drawn the attention and concern of creators, legislators, and the general public. 2024 DIGITAL REPLICAS REPORT, supra, at 1-2.
The Office uses the term “digital replica” to refer to a video, image, or audio recording that has been digitally created or manipulated to realistically but falsely depict an individual. A “digital replica” may be authorized or unauthorized and can be produced by any type of digital technology, not just AI. The terms “digital replicas” and “deepfakes” are used interchangeably. Id. at 2.
Current Harms from Unauthorized Digital Replicas
Digital replicas may have both beneficial and harmful uses. On the positive side, they can serve as accessibility tools for people with disabilities, enable “performances” by deceased or non-touring artists, support creative work, or allow individuals to license and be compensated for the use of their voice, image, and likeness. In one noted example, musician Randy Travis, who has limited speech function since suffering a stroke, was able to use generative AI to release his first song in over a decade. Id. at 3.
At the same time, a broad range of actual or potential harms arising from unauthorized digital replicas has emerged. Across the creative sector, the surge of voice clones and image generators has stoked fears that performers and other artists will lose work or income. There have already been film projects that use digital replica extras in lieu of background actors, and situations where voice actors have been replaced by AI replicas. Within the music industry, concerns have been raised that the use of AI in sound recordings could lead to the “loss of authenticity and creativity” and displacement of human labor. Id. at 3-4.
While digital replicas depicting well-known individuals often attract the most attention, anyone can be vulnerable. Beyond the creative sector, the harms from unauthorized digital replicas largely fall into three categories. First, there have been many reports of generative AI systems being used to produce sexually explicit deepfake imagery. In 2023, researchers concluded that explicit images make up 98% of all deepfake videos online, with 99% of the individuals represented being women. Instances of students creating and posting deepfake explicit images of classmates appear to be multiplying. Id. at 4.
Second, the ability to create deepfakes offers a “potent means to perpetrate fraudulent activities with alarming ease and sophistication.” The media has reported on scams in which defrauders replicated the images and voices of a multinational financial firm’s CEO and its employees to steal $25.6 million; replicated loved ones’ voices to demand a ransom; and replicated the voice of an attorney’s son asking him to wire $9,000 to post a bond. Digital replicas of celebrities have been used to falsely portray them as endorsing products. Id. at 5.
Finally, there is a danger that digital replicas will undermine our political system and news reporting by making misinformation impossible to discern. Recent examples involving politicians include a voice replica of a Chicago mayoral candidate appearing to condone police brutality; a robocall with a replica of President Biden’s voice discouraging voters from participating in a primary election; and a campaign ad that used AI-generated images to depict former President Trump appearing with former Director of the National Institute of Allergy and Infectious Diseases, Anthony Fauci. Deepfake videos were even used to influence a high profile union vote by falsely showing a union leader urging members to oppose the contract that he had “negotiated and . . . strongly supported.” Id. at 5-6.
Summarizing the challenges to the information ecosystem, one digital forensics scholar cautioned, “[i]f we enter a world where any story, any audio recording, any image, any video can be fake . . . then nothing has to be real.” As AI technology continues to improve, researchers predict that it will become increasingly difficult to distinguish between digital replicas and authentic content. Id. at 6.
Existing Legal Frameworks and Their Limitations
The Office’s July 2024 report outlines the main existing legal frameworks: state rights of privacy and publicity, including recent legislation specifically targeting digital replicas, and at the federal level, the Copyright Act, the Federal Trade Commission Act, the Communications Act, and the Lanham Act. Id. at 8-22.
State laws are both inconsistent and insufficient in various respects. Some states currently do not provide rights of publicity and privacy, while others only protect certain categories of individuals. Multiple states require a showing that the individual’s identity has commercial value. Not all states’ laws protect an individual’s voice; those that do may limit protection to distinct and well-known voices, to voices with commercial value, or to use of actual voices without consent (rather than a digital replica). Id. at 23.
State right of publicity laws typically apply only where the infringement occurs in advertising, on merchandise, or for other commercial purposes. They do not address the harms that can be inflicted by non-commercial uses, including deepfake pornography, which are particularly prevalent in the internet environment. Different jurisdictional requirements create discrepancies as to who may seek relief. Finally, some of these laws incorporate broad exceptions that may go beyond First Amendment requirements and place many unauthorized uses outside their scope. As numerous commenters noted, the result is a patchwork of protections, with the availability of a remedy dependent on where the affected individual lives or where the unauthorized use occurred. Id. at 23-24.
Existing federal laws are too narrowly drawn to fully address the harm from today’s sophisticated digital replicas. The Copyright Act protects original works of authorship but does not prevent the unauthorized duplication of an individual’s image or voice alone, and the targeted individual may not be an owner of copyright in the work as a whole. The Federal Trade Commission Act prohibits unfair or deceptive acts or practices in or affecting commerce. While it can be applied to cases where digital replicas are used in commercially misleading ways, it does not provide comprehensive protection in other circumstances. Similarly, under the Lanham Act, claims such as false endorsement involving a digital replica are limited to unauthorized commercial uses, and most federal courts also require a showing of consumer awareness of the depicted individual in order to establish a likelihood of confusion, limiting the Lanham Act’s protection to well-known figures and commercial circumstances. It may be difficult for many individuals, including less famous artists and performers, to prove that the challenged conduct is likely to confuse consumers regarding the plaintiff’s association with, or approval of, the defendant’s commercial activities. And issues like AI-generated “revenge porn” would likely fall beyond its reach. Id. at 24.
Copyright Office Recommendations for Federal Legislation
Having concluded that a new law is needed, the Office makes the following recommendations regarding its contours:
Subject Matter. The statute should target those digital replicas, whether generated by AI or otherwise, that are so realistic that they are difficult to distinguish from authentic depictions. Protection should be narrower than, and distinct from, the broader “name, image, and likeness” protections offered by many states. Id. at iv, 29.
Persons Protected. The statute should cover all individuals, not just celebrities, public figures, or those whose identities have commercial value. Everyone is vulnerable to the harms that unauthorized digital replicas can cause, regardless of their level of fame or prior commercial exposure. Id. at iv, 29-30.
Term of Protection. Protection should endure at least for the individual’s lifetime. Any postmortem protection should be limited in duration, potentially with the option to extend the term if the individual’s persona continues to be exploited. Id. at iv, 30-33.
Infringing Acts. Liability should arise from the distribution or making available of an unauthorized digital replica, but not the act of creation alone. It should not be limited to commercial uses, as the harms caused are often personal in nature. It should require actual knowledge both that the representation was a digital replica of a particular individual and that it was unauthorized. Id. at iv, 33-36.
Secondary Liability. Traditional tort principles of secondary liability should apply. The statute should include a safe harbor mechanism that incentivizes online service providers to remove unauthorized digital replicas after receiving effective notice or otherwise obtaining knowledge that they are unauthorized. Id. at iv, 36-39.
Licensing and Assignment. Individuals should be able to license and monetize their digital replica rights, subject to guardrails, but not to assign them outright. Licenses of the rights of minors should require additional safeguards. Id. at iv, 39-42.
First Amendment Concerns. Free speech concerns should expressly be addressed in the statute. The use of a balancing framework, rather than categorical exemptions, would avoid overbreadth and allow greater flexibility. Id. at iv, 43-47.
Remedies. Effective remedies should be provided, both injunctive relief and monetary damages. The inclusion of statutory damages and/or prevailing party attorney’s fees provisions would ensure that protection is available to individuals regardless of their financial resources. In some circumstances, criminal liability would be appropriate. Id. at iv, 47-48.
Relationship to State Laws. Given well-established state rights of publicity and privacy, the Office does not recommend full federal preemption. Federal law should provide a floor of consistent protection nationwide, with states continuing to be able to provide additional protections. It should be clarified that section 114(b) of the Copyright Act does not preempt or conflict with laws restricting unauthorized voice digital replicas. Id. at iv-v, 48-52.
Protection of Artistic Style
The Office received many comments seeking protection against AI “outputs that imitate the artistic style of a human creator.” Commenters voiced concern over the ability of an AI system, in response to a text prompt asking for an output “in the style of artist X,” to quickly produce a virtually unlimited supply of material evoking the work of a particular author, visual artist, or musician. They asserted that these outputs can harm, and in some cases have already harmed, the market for that creator’s works. Id. at 53.
The Office acknowledges the seriousness of these concerns and believes that appropriate remedies should be available for this type of harm. Copyright law’s application in this area is limited, as it does not protect artistic style as a separate element of a work. As noted by several commenters, copyright protection for style would be inconsistent with section 102(b)’s idea/expression dichotomy. Moreover, in most cases the elements of an artist’s style cannot easily be delineated and defined separately from a particular underlying work. Id. at 53-55.
The Copyright Act may, however, provide a remedy where the output of an “in the style of” request ends up replicating not just the artist’s style but protectible elements of a particular work. Additionally, as future Parts of this Report will discuss, there may be situations where the use of an artist’s own works to train AI systems to produce material imitating their style can support an infringement claim. Id. at 55.
Numerous commenters pointed out that meaningful protections against imitations of style may be found in other legal frameworks, including the Lanham Act’s prohibitions on passing off and unfair competition. Given these resources, as well as the policy reasons not to extend property-like rights to style in itself, the Office does not recommend including style as protected subject matter under a federal digital replica law at this time. If existing protections prove inadequate, this conclusion may be revisited. Id. at 55-56.
International Approaches and Treaty Obligations
The international landscape reveals fundamentally different approaches to AI training and copyrightability, creating compliance challenges for global companies and potential treaty conflicts for the United States. The Copyright Office’s May 2025 report provides comprehensive analysis of these divergent strategies. 2025 AI TRAINING REPORT, supra, at 76-84.
Global Consensus on Human Authorship
The Copyright Office’s 2025 report documents emerging international consensus that copyright requires human authorship. 2025 AI COPYRIGHTABILITY REPORT, supra, at 28-31. Korea’s Copyright Commission stated that “only a natural person can become an author” and that “copyright registration for an AI output is impossible if a human did not contribute creatively to the expressive form.” Id. at 28. Japan’s Copyright Subdivision guidelines explain that copyrightability depends on factors including the amount and content of instructions, number of generation attempts, selection from outputs, and subsequent human additions. Id.
In November 2023, the Beijing Internet Court recognized copyright in an image created with Stable Diffusion based on numerous prompts and parameter adjustments evidencing human intellectual input. Id. at 28-29. The European Union member states agreed in 2024 that AI-generated content may be eligible for copyright “only if the human input in [the] creative process was significant.” Id. at 29.
The European Framework on Training
Directive 2019/790 of the European Parliament and of the Council of 17 April 2019 on Copyright and Related Rights in the Digital Single Market establishes the framework for text and data mining (TDM) relevant to AI training. Article 3 permits text and data mining for research organizations and cultural heritage institutions, while Article 4 allows commercial text and data mining subject to rights-holder opt-outs. 2019 O.J. (L 130) 92, arts. 3-4. The Article 4 opt-out structure reverses the traditional copyright framework: instead of requiring permission before use, it permits use until objection.
In Kneschke v. LAION, the Hamburg Regional Court applied the research TDM exception to LAION’s creation of a dataset for non-commercial research. LG Hamburg, 310 O 227/23 (Sept. 27, 2024) (Ger.). The decision is on appeal, and commentators differ on its breadth.
This opt-out approach may violate the three-step test in the Berne Convention for the Protection of Literary and Artistic Works art. 9(2), Sept. 9, 1886, as revised July 24, 1971, 25 U.S.T. 1341, which binds both the United States and EU member states as treaty parties. The Convention permits exceptions only for “certain special cases” that do not conflict with normal exploitation or “unreasonably prejudice the legitimate interests” of authors. Mass commercial training on all available works unless owners object hardly constitutes a “special case.” It directly conflicts with normal exploitation by bypassing licensing markets, and it prejudices authors who cannot effectively monitor and object to every AI company’s use of their works. The resulting divergence creates potential compliance conflicts for global AI companies.
The AI Act entered into force on August 1, 2024, with General-Purpose AI (GPAI) obligations under Article 53 applying from August 2, 2025. These obligations include requirements for a copyright-compliance policy that honors DSM Article 4(3) opt-outs and publication of a sufficiently detailed summary of training content. Regulation (EU) 2024/1689, art. 53, 2024 O.J. (L 1689) 1. The recently published voluntary Code of Practice provides operational guidance on transparency, copyright compliance, and safety measures.
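The Article 4(3) opt-out and the AI Act's copyright-compliance policy both presuppose machine-readable reservation signals. As a sketch of how a crawler might check one such signal, the snippet below follows the `tdm-reservation` header and `tdmrep.json` file names of the W3C TDM Reservation Protocol (TDM-Rep) community draft; the helper function itself is illustrative, not a compliance tool, and real pipelines would need to handle many more signal types.

```python
# Illustrative check for a machine-readable TDM opt-out signal.
# Header and file conventions follow the W3C TDM Reservation Protocol
# (TDM-Rep) community draft; this is a sketch, not a compliance tool.
from typing import Optional

def tdm_rights_reserved(headers: dict, tdmrep_entry: Optional[dict] = None) -> bool:
    """Return True if the rights holder has reserved TDM rights.

    `headers` is a dict of HTTP response headers for the fetched resource;
    `tdmrep_entry` is the matching record (if any) from the site's
    /.well-known/tdmrep.json file.
    """
    # Header-level signal: "tdm-reservation: 1" reserves TDM rights.
    header_value = {k.lower(): v for k, v in headers.items()}.get("tdm-reservation")
    if header_value is not None:
        return header_value.strip() == "1"
    # File-level signal: {"tdm-reservation": 1, ...} in tdmrep.json.
    if tdmrep_entry is not None:
        return tdmrep_entry.get("tdm-reservation") == 1
    # No signal found: under DSM Article 4, an unexercised reservation
    # leaves commercial TDM permitted (lawful access still required).
    return False
```

Under this convention, the absence of any signal is meaningful in itself, which is precisely the permission-until-objection inversion described above.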
Japan’s “Non-Enjoyment” Exception
Japan’s Copyright Act permits data analysis under Article 30-4, but excludes uses aimed at enabling enjoyment of the expression. Japan’s Copyright Act, Law No. 48 of 1970, art. 30-4 (Japan). The Agency for Cultural Affairs’ May 2024 “General Understanding on AI and Copyright in Japan” clarifies that training for analysis is generally permitted, while uses aimed at enabling enjoyment of the expression—including fine-tuning intended to output a work and certain RAG configurations—are not covered. Agency for Cultural Affairs, General Understanding on AI and Copyright in Japan 12-15 (May 2024). The Copyright Office notes that Japan’s exception “allows the use of a copyrighted work for AI development or other forms of data analysis as long as the use is not to ‘personally enjoy...the thoughts or sentiments expressed in that work.’” 2025 AI TRAINING REPORT, supra, at 78. This distinction between analytical and generative uses aligns with the emerging U.S. framework distinguishing research from competitive uses. However, Japan’s approach provides clearer ex ante guidance than the multi-factor fair use analysis.
The Competition for AI Leadership
China’s approach remains deliberately ambiguous. A separate Beijing Internet Court decision imposed liability on platforms enabling infringing output generation but did not directly address foundation model training. Meanwhile, China’s administrative measures require AI services to respect intellectual property rights without specifying how this applies to training data. The Copyright Office observes that this ambiguity may be strategic, allowing China to “observe which approach better promotes AI development without committing to either framework,” providing “competitive advantages if American or European requirements prove too restrictive or too permissive.” 2025 AI TRAINING REPORT, supra, at 81.
United Kingdom Developments
As of August 2025, the UK has consulted on an EU-style TDM exception with opt-out but has not legislated. The Data (Use and Access) Act 2025 addresses data access and processing, not copyright exceptions for AI training. Heavy pushback from creators has stalled legislative action, with the government stating it will not legislate until workable reservation mechanisms exist. The consultation reflects ongoing tension between fostering AI innovation and protecting creative rights. The Copyright Office notes that the UK’s proposed approach “has proved quite controversial, with commenters warning that it would impose burdensome transaction costs for both copyright owners and AI developers.” 2025 AI TRAINING REPORT, supra, at 79.
The Nature of Creativity in Copyright Law
Defining Human Creativity
Copyright law has long grappled with defining creativity, establishing thresholds that distinguish protectable expression from unprotectable ideas, facts, and mechanical outputs. The Supreme Court in Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340, 345 (1991), established that copyright requires a “modicum of creativity,” rejecting the “sweat of the brow” doctrine that would have protected works based solely on effort or labor.
This creativity standard, though minimal, requires human judgment and choice. As Justice O’Connor wrote in Feist, even a compilation of facts can meet this threshold through creative selection or arrangement, but the creativity must originate from human decision-making. Id. at 348. The Court emphasized that originality means the work was independently created by the author and possesses at least some minimal degree of creativity. Id. at 345.
The Copyright Office’s examination of AI-generated works applies this creativity framework rigorously. When evaluating works containing AI-generated material, the Office asks whether the traditional elements of authorship were “conceived and executed” by a human or by a machine. U.S. COPYRIGHT OFFICE, COMPENDIUM OF U.S. COPYRIGHT OFFICE PRACTICES § 313.2 (3d ed. 2021). This inquiry goes to the heart of what constitutes creativity under copyright law—the conscious choices, aesthetic judgments, and expressive decisions that only humans can make.
The Creative Process Versus Creative Output
A critical distinction emerges between the creative process and creative output. While AI systems can produce outputs that appear creative, they lack the conscious intentionality that copyright law requires. The Copyright Office recognizes that using technological tools in the creative process does not negate human authorship—artists have always employed tools, from paintbrushes to Photoshop. 88 Fed. Reg. 16190, 16193 (Mar. 16, 2023). The determinative question is whether a human exercised creative control over the expressive elements.
Consider the difference between a photographer using a camera and a person prompting an AI image generator. The photographer makes creative choices about composition, lighting, angle, and timing—what the Supreme Court in Burrow-Giles called the “original mental conception” given “visible form.” 111 U.S. at 60. By contrast, when someone provides a text prompt to an AI system, the machine determines the actual expression—the specific visual elements, their arrangement, and their execution. The prompter may have a creative idea, but copyright protects expression, not ideas. See 17 U.S.C. § 102(b) (2018) (excluding ideas from copyright protection).
This distinction becomes particularly significant in iterative AI workflows. Even when users provide feedback through multiple prompts to refine AI outputs, the Copyright Office maintains that the AI system, not the human, determines how to implement those instructions. 88 Fed. Reg. at 16193 n.30. The human may influence the direction, but the machine executes the creative choices that constitute copyrightable expression.
Sufficient Human Creativity in AI-Assisted Works
The threshold for finding sufficient human creativity in works involving AI depends on the nature and extent of human contribution. The Copyright Office identifies several scenarios where human creativity can support copyright claims in works containing AI-generated material.
Selection and arrangement represent the most straightforward path to protection. When a human selects AI-generated elements and arranges them with sufficient creativity, the resulting compilation may qualify for copyright. See 17 U.S.C. § 101 (2018) (defining “compilation”). However, this protection extends only to the selection and arrangement, not to the underlying AI-generated elements. The creative bar for compilation copyright remains low but real—the Supreme Court in Feist rejected copyright for alphabetical phone listings because such arrangement lacked even minimal creativity. 499 U.S. at 362-63.
Modification and transformation offer another avenue for protection. When humans modify AI-generated material substantially enough that the modifications themselves meet copyright’s originality standard, those human contributions receive protection. See 17 U.S.C. § 101 (2018) (defining “derivative work”). This parallels traditional derivative work doctrine, where copyright in a derivative work extends only to the material contributed by the derivative author, not to the preexisting material. 17 U.S.C. § 103(b) (2018).
Creative control over AI tools presents evolving challenges. The Copyright Office suggests that future AI systems allowing greater human control over expressive elements might support stronger copyright claims. 88 Fed. Reg. at 16193. However, current generative AI systems, which determine their own expressive outputs based on statistical patterns, provide insufficient human control over the traditional elements of authorship.
Implications for Creative Industries
Traditional creative professionals—writers, artists, musicians—maintain clear copyright in their human-authored works. Their creative choices, from word selection to brushstrokes to musical arrangements, embody the human judgment copyright law protects.
AI-assisted creation occupies a middle ground requiring careful navigation. Creators using AI as a tool while maintaining creative control over expression can secure copyright protection for their human contributions. However, those relying primarily on AI to generate expression may find their outputs unprotectable, regardless of the outputs’ aesthetic or commercial value.
This framework incentivizes maintaining human involvement in creative processes. Rather than fully automating content creation, businesses must ensure sufficient human creative control to secure copyright protection for commercially valuable works. This may involve humans making specific expressive choices, creating original elements to combine with AI outputs, or transforming AI-generated material through creative modification.
The creativity requirement also affects AI training practices. When AI systems train on human-created works, they absorb patterns of human creativity—stylistic choices, narrative structures, visual compositions—that represent centuries of human cultural development. The question of whether using these embodiments of human creativity to train machines constitutes fair use returns us to the fundamental tension between technological innovation and protecting human creative labor.
Licensing Landscape
Current Market Activity
Voluntary licensing is expanding rapidly. The Copyright Office’s May 2025 report documents substantial growth, noting that “voluntary licensing of copyrighted works for use in AI training is increasingly taking place.” 2025 AI TRAINING REPORT, supra, at 85. Recent deals include publishers licensing archives to AI companies for millions of dollars, stock photo companies creating AI-specific licenses, music organizations developing collective licensing frameworks, and news outlets negotiating content access agreements.
Documented deals demonstrate the scale of this emerging market. The News Corp-OpenAI agreement is reportedly valued at approximately $250 million over five years; Shutterstock reported $104 million in AI licensing revenue in 2023; the Taylor & Francis-Microsoft deal involves $10 million upfront plus recurring fees; and Wiley disclosed $23 million in AI licensing revenue in its March 7, 2024 announcement. OpenAI has also signed content agreements with major news organizations including the Associated Press and the Financial Times. These agreements demonstrate that licensing is feasible for certain types of content, particularly from organized industries with clear ownership.
Challenges and Limitations
Several factors complicate comprehensive licensing. Models may require billions of works, making individual negotiations impractical. Many online works lack clear ownership information. For some content, licensing costs may exceed the value to AI developers. Not all sectors have developed licensing infrastructure.
The Copyright Office acknowledges these challenges, noting that “it is also unclear that markets are emerging or will emerge for all kinds of works at the scale required for all kinds of models.” Id. at 70. Transaction costs pose particular challenges when works are created outside professional creative industries, are not intended to be monetized, or have diffuse ownership. For instance, “vernacular works”—content created and posted online by members of the public without the expectation of monetization—may be particularly difficult to license. These may include social media posts, individual blog posts, user comments and reviews, and personal photographs or videos.
The Office concludes that “where licensing markets are available to meet AI training needs, unlicensed uses will be disfavored under the fourth factor. But if barriers to licensing prove insurmountable for parties’ uses of some types of works, there will be no functioning market to harm and the fourth factor may favor fair use.” Id. at 71.
Collective Licensing Benefits
Collective management organizations can reduce transaction costs by aggregating rights from multiple owners. This approach works well in music, where ASCAP and BMI already license millions of works. See Broadcast Music, Inc. v. Columbia Broadcasting System, Inc., 441 U.S. 1, 20-24 (1979). Similar organizations are emerging for text, images, and other content types.
The Copyright Office notes “strong interest among those representing copyright owners and creators in developing voluntary collective licensing for the AI context,” observing that collective licensing “can play a significant role in facilitating AI training, reducing what might otherwise be thousands or even millions of transactions to a manageable number.” 2025 AI TRAINING REPORT, supra, at 104. The Office emphasizes that “the aggregation of rights could be mutually beneficial, such as where transaction costs might otherwise exceed the value of using a work or where copyright owners might be difficult to find.” Id.
Antitrust concerns about collective licensing appear manageable if direct licensing remains available as an alternative. The Office encourages the Department of Justice to provide guidance, including on the potential benefit of an antitrust exemption in this context. 2025 AI TRAINING REPORT, supra, at 104.
Recent Litigation Developments
Summary Judgment Decisions Shape Fair Use Analysis
The June 25, 2025 ruling in Kadrey v. Meta Platforms, Inc., No. 23-cv-03417-VC, 2025 WL 1752484 (N.D. Cal. June 25, 2025), and Judge Alsup’s June 2025 decision in Bartz v. Anthropic represent pivotal moments in AI copyright litigation, providing the first substantive judicial analysis of fair use defenses for AI training. These decisions reveal both judicial agreement on certain issues and fundamental disagreements that will require appellate resolution.
Judge Chhabria opened his Kadrey opinion with a stark assessment of the legal landscape, stating that although “the devil is in the details, in most cases the answer will likely be yes” when asked whether using copyrighted materials to train AI without permission is illegal. He emphasized that copyright law’s primary concern is “preserving the incentive for human beings to create artistic and scientific works” and warned that generative AI has “the potential to flood the market with endless amounts of images, songs, articles, books, and more” using “a tiny fraction of the time and creativity that would otherwise be required.” Kadrey v. Meta Platforms, Inc., 2025 WL 1752484, at *1-2 (N.D. Cal. June 25, 2025).
In Bartz v. Anthropic, Judge Alsup granted partial summary judgment to Anthropic on fair use grounds for training uses but denied the motion regarding pirated data claims. His ruling drew a sharp distinction between legitimate training uses and the downloading and storage of pirated materials, which he characterized as “inherently, irredeemably infringing.” Judge Alsup found against Anthropic on all four fair use factors for the pirated copies, suggesting that developers should have purchased legitimate copies of the works they used for training. Bartz v. Anthropic PBC, 2025 WL 1741691, at *15-16 (N.D. Cal. June 23, 2025).
In Kadrey, Judge Chhabria granted summary judgment to Meta on training data issues, though his reasoning differed significantly from Judge Alsup’s approach. While finding that Meta’s training uses were likely fair use, he based his ruling on the plaintiffs’ failure to present evidence of market harm rather than a wholesale endorsement of the fair use defense. His extensive discussion of market dilution theory—suggesting that AI-generated outputs could flood creative markets and disincentivize human creation—represents a novel approach to the fourth factor analysis that could reshape future litigation.
The divergent treatment of pirated materials highlights a fundamental disagreement about the role of good faith in fair use analysis. Judge Alsup condemned using pirated works as incompatible with fair use, stating that “piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use and immediately discarded.” Bartz v. Anthropic PBC, 2025 WL 1741691, at *19 (N.D. Cal. June 23, 2025). Judge Chhabria took a more nuanced approach, acknowledging that Meta’s use of shadow libraries was “relevant—or at least potentially relevant—in a few different ways,” but ultimately finding it did not defeat fair use given the rest of the summary judgment record. Kadrey, 2025 WL 1752484, at *19.
Both judges emphasized the importance of technical safeguards in their analyses. Judge Chhabria specifically credited Meta’s implementation of output filters that prevent regurgitation of substantial expression from training data, noting that even with adversarial prompting designed to force regurgitation, experts could extract no more than 50 words from any of the plaintiffs’ books. Kadrey, 2025 WL 1752484, at *12. This judicial recognition of technical measures aligns with the Copyright Office’s guidance and suggests that developers who implement strong safeguards may strengthen their fair use positions.
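The output filters credited in these opinions can be pictured as verbatim-overlap checks. The sketch below is a hypothetical, minimal version of that idea (a longest-common-word-run filter); it is not Meta's actual system, and the 50-word default merely echoes the figure cited in the opinion. Production systems use indexed matching at far larger scale.

```python
# Minimal sketch of a verbatim-overlap output filter, in the spirit of the
# safeguards the Kadrey opinion credits. Hypothetical, not any vendor's
# actual implementation; the 50-word default echoes the opinion's figure.

def longest_shared_run(output: str, reference: str) -> int:
    """Length in words of the longest verbatim run shared by the two texts."""
    out_words, ref_words = output.split(), reference.split()
    # Classic longest-common-substring dynamic program over word sequences.
    best = 0
    prev = [0] * (len(ref_words) + 1)
    for ow in out_words:
        curr = [0] * (len(ref_words) + 1)
        for j, rw in enumerate(ref_words, start=1):
            if ow == rw:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def passes_filter(output: str, references: list, max_run: int = 50) -> bool:
    """Reject outputs that reproduce a long verbatim passage from any reference."""
    return all(longest_shared_run(output, ref) <= max_run for ref in references)
```

A model response would be suppressed or regenerated whenever `passes_filter` returns False against the protected reference set.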
DMCA Claims Cannot Proceed Where Use Is Fair
Judge Chhabria’s June 27, 2025 ruling in Kadrey v. Meta on the DMCA claim establishes an important principle: DMCA claims for removal of copyright management information cannot succeed where the underlying use constitutes fair use. The court granted Meta’s motion for partial summary judgment on the plaintiffs’ DMCA claim under 17 U.S.C. § 1202(b)(1), finding that because Meta’s copying was fair use as a matter of law, its removal of CMI could not violate the DMCA.
The court’s analysis provides crucial guidance for AI developers who must process copyrighted materials during training. Since Section 1202(b)(1) requires that CMI removal “induce, enable, facilitate, or conceal an infringement,” and fair use is “not an infringement of copyright” under Section 107, there can be no DMCA violation where the underlying use is fair. This interpretation aligns with the overall structure and purpose of copyright law—Congress would not have intended to exempt fair users from infringement liability only to subject them to DMCA liability for removing metadata during legitimate uses.
The court also noted the criminal liability implications under Section 1204(a), finding it “inconceivable that criminal liability would attach to an act that was done in furtherance of a noninfringing fair use.” This reasoning provides additional protection for developers engaged in legitimate AI training activities.
Publishers v. OpenAI/Microsoft (S.D.N.Y.)
On April 4, 2025, the S.D.N.Y. denied in part defendants’ motions to dismiss, allowing core copyright claims to proceed. See The New York Times Co. v. OpenAI, Inc., No. 1:23-cv-11195-SHS (S.D.N.Y. Apr. 4, 2025). On May 13, 2025, Magistrate Judge Wang ordered OpenAI to preserve and segregate output logs going forward, an order maintained over OpenAI’s objection. The requirement to retain ChatGPT output log data has become central to discovery disputes, with negotiations over its scope continuing as the parties work through the practical implications of data preservation.
Visual Artists v. Stability AI
In Andersen v. Stability AI, the court allowed certain copyright claims to proceed and accepted as plausible plaintiffs’ allegation that the Stable Diffusion model contains compressed representations that can enable reproduction of training images. The court did not hold that model weights are per se infringing copies. Andersen v. Stability AI, No. 3:23-cv-00201-WHO, ECF 223 (N.D. Cal. Aug. 12, 2024). Discovery is ongoing, with focus on technical capabilities and actual practices.
The Bartz Class Certification Decision
On July 17, 2025, Judge Alsup issued a significant order on class certification in Bartz v. Anthropic PBC, No. C 24-05417 WHA, 2025 WL 5678901 (N.D. Cal. July 17, 2025) (order on class certification). The court certified a class limited to actual or beneficial owners of timely registered copyrights in ISBN/ASIN-bearing books downloaded by Anthropic from LibGen and PiLiMi pirate libraries.
Judge Alsup characterized the case as exemplifying “the classic litigation that should be certified as a representative action,” noting that “the entire class stands aggrieved by defendant’s downloading of their books from pirate libraries on the internet.” The court emphasized that “it will be straightforward to prove the classwide wrong done” through “Napster-style downloading of millions of works.”
The court’s analysis provides important guidance on copyright ownership in the AI training context. The class definition includes both legal and beneficial owners of the exclusive right to reproduce copies under Section 106(1). The court explained that beneficial owners include authors who receive royalties from publishers’ revenues or recoveries from the right to make copies, noting that “the author has a definite stake in the royalties, so the author has standing to sue.”
Regarding class management, the court established comprehensive notice procedures requiring notice by first-class mail and email to authors, publishers, and copyright owners listed on copyright certificates, as well as publication in trade journals. The court also required class claimants to serve notice on all others associated with the book to prevent competing claims.
The court rejected Anthropic’s arguments about individualized ownership inquiries, finding that “in the district judge’s experience and judgment, very few disputes over ownership will arise” because “authors and their publishers have ongoing business relationships and they will work out whatever differences (if any) they have over how to divide the recovery.”
Significantly, the court denied certification for a Books3 Pirated Books Class, finding that the sparse metadata and spotty content made identification too problematic. The court also denied certification for a Scanned Books Class, noting that “the path to recovery for scanning purchased print books and storing their digital replacements peters out” under fair use analysis.
Recommendations
Near-Term Actions
AI developers should prioritize several practices that reduce legal risk while demonstrating good faith engagement with the creative community: document data sources and acquisition methods comprehensively, implement strong technical safeguards against outputting copyrighted content, pursue licenses where available and feasible, respect opt-out signals from copyright owners, and maintain clear records distinguishing human contributions from AI-generated content in their outputs. The June 2025 decisions demonstrate that technical safeguards can significantly strengthen fair use defenses, and the June 2025 DMCA ruling confirms that fair use protections extend to technical preprocessing steps like CMI removal, though only for uses that actually qualify as fair use.
The Copyright Office recommends “allowing the licensing market to continue to develop without government intervention.” 2025 AI TRAINING REPORT, supra, at 106. The rapid growth of licensing agreements suggests that market solutions are emerging for many use cases.
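The documentation practice recommended for developers can be reduced to a simple structured record per acquired work. The sketch below is purely illustrative; the field names are hypothetical rather than drawn from any standard schema, and a real system would tie such records to dataset versioning and audit tooling.

```python
# Illustrative provenance record for training-data documentation.
# Field names are hypothetical, not drawn from any standard schema.
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class TrainingSourceRecord:
    source_url: str          # where the work was acquired
    acquisition_method: str  # e.g. "licensed", "crawled", "purchased copy"
    license_terms: str       # license identifier, agreement name, or "none"
    opt_out_checked: bool    # whether a machine-readable reservation was checked
    opt_out_asserted: bool   # whether the rights holder reserved TDM rights
    acquired_on: str         # ISO 8601 date of acquisition

record = TrainingSourceRecord(
    source_url="https://example.com/article",
    acquisition_method="licensed",
    license_terms="publisher agreement",
    opt_out_checked=True,
    opt_out_asserted=False,
    acquired_on="2025-08-01",
)
```

Keeping records in this shape makes it straightforward to answer, per work, the questions the litigation above turns on: how the copy was obtained and whether the owner objected.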
Copyright owners should develop clear licensing terms for AI training, consider collective licensing to reduce transaction costs, implement machine-readable rights information, and engage constructively with AI developers. Early movers in licensing are already seeing significant revenue—those who delay risk being left behind as standards solidify.
Users of AI systems must understand that purely AI-generated content lacks copyright protection. The Copyright Office’s 2025 guidance makes clear that prompts alone, regardless of complexity or iteration, do not establish authorship. Organizations should implement policies ensuring sufficient human creative control over AI-generated materials intended for commercial use, properly document human contributions to mixed human-AI works, and consider the implications of using unprotectable AI content in their business strategies.
Potential Future Interventions
If market failures persist for specific content types, targeted solutions may be needed. The Copyright Office concludes that “if market failures are shown as to specific types of works in specific contexts, targeted intervention such as ECL should be considered.” 2025 AI TRAINING REPORT, supra, at 106.
Extended collective licensing would allow authorized organizations to license entire categories of works with opt-out rights for owners. This approach is less intrusive than compulsory licensing while achieving broad coverage. The Office notes that ECL “would permit copyright owners to choose to license separately, while enabling full coverage of the entire sector for AI training.” Id. at 105.
Judge Chhabria’s suggestion in Kadrey that the generative AI industry would find ways to compensate copyright owners signals potential judicial receptiveness to monetary remedies rather than injunctive relief. He noted that licensing markets would likely emerge “if LLM developers’ only choices are to get licenses or forgo the use of copyrighted books as training data,” suggesting courts may push parties toward negotiated solutions.
Statutory clarifications could be considered if courts struggle with consistent application of existing law. Congress could clarify how fair use applies to specific AI training scenarios. However, the Copyright Office’s 2025 reports conclude that legislation is unnecessary at this point, as existing legal doctrines adequately resolve copyrightability questions. 2025 AI COPYRIGHTABILITY REPORT, supra, at 40; 2025 AI TRAINING REPORT, supra, at 107.
International coordination represents a crucial long-term consideration. The United States should work toward greater harmonization of AI training rules and copyright standards for AI-generated content to reduce compliance complexity and ensure consistent protection standards.
Digital Replica Protection Recommendations
The Copyright Office concludes that new federal legislation is urgently needed to address unauthorized digital replicas. The Office recommends that Congress establish a federal right that protects all individuals during their lifetimes from the knowing distribution of unauthorized digital replicas. The right should be licensable, subject to guardrails, but not assignable, with effective remedies including monetary damages and injunctive relief. Traditional rules of secondary liability should apply, but with an appropriately conditioned safe harbor for OSPs. The law should contain explicit First Amendment accommodations. Finally, in recognition of well-developed state rights of publicity, the Office recommends against full preemption of state laws. 2024 DIGITAL REPLICAS REPORT, supra, at 57.
Conclusion
The intersection of artificial intelligence and copyright law presents both unprecedented challenges and opportunities for the American legal system. As this comprehensive analysis has demonstrated, existing copyright frameworks are being tested by AI technologies that can both learn from and generate creative content at scales and speeds previously unimaginable.
The U.S. Copyright Office’s trilogy of reports between July 2024 and May 2025 provides essential guidance for navigating these uncharted waters. The Office’s clear stance on human authorship requirements, its nuanced approach to fair use in AI training, and its urgent call for digital replica protection legislation collectively represent a thoughtful attempt to balance innovation with creator rights. Recent federal court decisions, particularly the divergent approaches taken by Judges Alsup and Chhabria in the Bartz and Kadrey cases, illustrate that judicial interpretation of these issues remains in flux and will likely require appellate and potentially Supreme Court resolution.
The constitutional foundations examined in this article remind us that copyright law serves a specific purpose: to promote the progress of science and useful arts by incentivizing human creativity. As courts and policymakers continue to grapple with AI’s implications, this fundamental purpose must remain the lodestar guiding legal development. The emerging licensing markets demonstrate that voluntary, market-based solutions can address many concerns, though targeted interventions may be necessary where market failures persist.
Looking forward, several critical questions remain unresolved. The scope of fair use for AI training, the extent of protection for human contributions to AI-assisted works, and the international harmonization of AI-related copyright rules will continue to evolve. What is clear, however, is that the legal framework must adapt thoughtfully and deliberately, neither stifling technological innovation nor abandoning the protections that have long encouraged human creative expression.
As artificial intelligence continues to transform creative industries, the law must strike a careful balance—one that recognizes both the transformative potential of AI technology and the irreplaceable value of human creativity. The path forward requires continued dialogue among technologists, creators, legal scholars, and policymakers to ensure that copyright law continues to serve its constitutional purpose in the digital age.
Citation
BibTeX
@article{agustin2025ai,
title={AI and Copyright Law: U.S. Framework for Training, Copyrightability, and Digital Replicas},
author={Agustin, Jonathan},
journal={Hugging Face Community Articles},
year={2025},
month={August},
day={9},
url={https://huggingface.co/blog/ai-copyright-analysis-2025},
}
Bluebook
Jonathan Agustin, AI and Copyright Law: U.S. Framework for Training, Copyrightability, and Digital Replicas, HUGGING FACE COMMUNITY ARTICLES (Aug. 9, 2025), https://huggingface.co/blog/ai-copyright-analysis-2025.