The Internet Archive’s Open Library Faces Copyright Challenge

By Shane Wax

A group of publishers, made up of Hachette, HarperCollins, John Wiley & Sons and Penguin Random House, has sued the Internet Archive for infringing, through its “Open Library,” the copyrights in 127 books by C.S. Lewis, J.D. Salinger, Lemony Snicket and many others. [i]  The lawsuit was spurred by the Internet Archive’s pandemic-born “emergency” plan to remove longstanding restrictions on access to digitally reproduced books, but even a premature end to that program does not appear to have ended the legal dispute, which could test the bounds of “fair use” under Section 107 of the Copyright Act, as well as the definition of, and scope of protection afforded to, digital libraries and archives under Section 108 of the Act.

Background

The Internet Archive is a well-known online library, reference tool and non-profit whose mission “is to provide universal access to all knowledge.”[ii] It grew from archiving Internet webpages, through its trademarked “Wayback Machine,” and expanded to include digitized paper texts, images, audio, and video. Today, the public can access its archive containing over 300 billion web pages, 20 million books, 1.6 million TV news programs, 400,000 U.S. patents, 180,000 live concerts, 20,000 computer games, and still more, including many old and rare works thought lost forever.[iii]  Inherently, the acts of digitizing, reproducing, displaying and distributing works raise issues of copyright law, for which the Internet Archive recently found itself in a “boat of hot water.”

In the case of books, the Internet Archive strives to function like a traditional library, albeit digital: books may be “borrowed” only by users with registered accounts (akin to a library card), depending on the availability of the specific book. Just like a traditional library, only one user can access a copy of a work at a time.  If the Internet Archive has only one copy, users are restricted to reading the work in their Internet browser for up to one hour. If multiple copies are available, a copy can be “checked out” for up to two weeks, and even downloaded by the user as an encrypted file with access automatically disabled after 14 days. If each of the multiple copies has already been “checked out,” a user must join a waitlist before accessing the work.  This practice has become known as “Controlled Digital Lending” or “CDL.” It has gained support from legal scholars, academics and library-affiliated groups, and is used by other institutions including UCLA Library and Georgetown Law Library.[iv]
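The lending rules described above can be illustrated with a short sketch. This is only a rough model built from the description in this article, not the Internet Archive's actual software: the class name, method name, return strings, and the assumption that the first user on the waitlist is served before a newcomer are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timedelta

BROWSER_LOAN = timedelta(hours=1)   # single-copy titles: in-browser reading only
DOWNLOAD_LOAN = timedelta(days=14)  # multi-copy titles: encrypted, time-limited file


@dataclass
class Title:
    """Hypothetical model of one work lent under CDL."""
    copies_owned: int
    loans: dict = field(default_factory=dict)      # user -> loan expiry time
    waitlist: deque = field(default_factory=deque)

    def borrow(self, user: str, registered: bool, now: datetime) -> str:
        if not registered:
            return "denied: registered account required"
        # Access is disabled automatically once the loan period lapses.
        self.loans = {u: end for u, end in self.loans.items() if end > now}
        if user in self.loans:
            return "already on loan to this user"
        has_free_copy = len(self.loans) < self.copies_owned
        # Assumption: whoever is first on the waitlist is served first.
        next_in_line = not self.waitlist or self.waitlist[0] == user
        if not (has_free_copy and next_in_line):
            if user not in self.waitlist:
                self.waitlist.append(user)
            return "waitlisted"
        if self.waitlist:
            self.waitlist.popleft()
        if self.copies_owned == 1:
            self.loans[user] = now + BROWSER_LOAN
            return "browser loan (1 hour)"
        self.loans[user] = now + DOWNLOAD_LOAN
        return "checked out (14 days)"
```

The point of the model is the one-to-one ratio between copies held and copies on loan: the number of simultaneous readers can never exceed `copies_owned`, which is the feature that the National Emergency Library later suspended.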

Of course, amassing such a large volume of works and making them available online necessarily means the Internet Archive is engaged in copying and distributing copyrighted works.  However, by acting as a traditional library, the Internet Archive was attempting to comply with Section 108 of the Copyright Act, which effectively creates a safe harbor for libraries and archives to reproduce and distribute “no more than one copy or phonorecord of a work” without running afoul of copyright laws.

With thousands of researchers and students stuck at home during the COVID-19 pandemic, the Internet Archive sought to bring the books to them by expanding its Open Library.  In March 2020, the Internet Archive launched a “National Emergency Library,” offering free and unlimited access to its trove of over 1 million books, including many works still subject to copyright law.  Naturally, publishers took notice of this “emergency” program and did not like it; after all, if anyone and everyone can download a free copy of “The Catcher in the Rye,” who is going to pay for a physical copy or a licensed e-book?

So, in early June, the publishers sued, and within two weeks of the lawsuit being filed, the Internet Archive had ended its National Emergency Library program.  However, the lawsuit continues.  Indeed, the lawsuit claims that the Internet Archive’s entire Open Library program, even subject to CDL, runs afoul of the Copyright Act.  The Complaint asserts that CDL is an “invented theory” with no basis in the Copyright Act, and argues that the Internet Archive “merely exploits the investments that publishers have made in their books . . . pays for none of the expenses that go into publishing a book and is nothing more than a mass copier and distributor of bootleg works.”[v]

The Complaint further asserts that “[d]igital books are inherently different from physical books . . . [and] Publishers have established independent and distinct distribution models for e-books, including a market for lending e-books through libraries, which are governed by different terms and expectations than print books,” and that, as such, the Internet Archive’s Open Library program directly competes with and replaces this e-book model.[vi]  Independently, the New York Times has noted that the Internet Archive “operates differently from public libraries that have e-book lending programs” in that “[p]ublic libraries get licenses from publishers for the e-books they lend, and publishers receive payments, according to the terms that are set,” while the Internet Archive has no comparable license agreements.[vii]

And while the lawsuit only focuses on 127 books, a decision in favor of the publishers could have broader impacts on the Internet Archive, which has preserved thousands of other works, including books and software, that are still subject to copyright laws, but which are out of print, practically obsolete, or otherwise unavailable to the public.

The Internet Archive has not yet formally interposed any defenses, but if this case proceeds to a decision on the merits, it may consider two statutory defenses.  First, certain acts of reproduction and distribution by libraries and archives are statutorily excluded from the definition of “infringement” provided the institution satisfies the requirements of Section 108 of the Copyright Act; if this “super defense” fails, the Internet Archive will likely argue “fair use” under Section 107.[viii]

Does The Internet Archive Qualify for Library & Archive Protections?

Under Section 108 of the Copyright Act, “it is not an infringement of copyright for a library or archives . . . to reproduce no more than one copy or phonorecord of a work,” for preservation purposes provided that “(1) the reproduction or distribution is made without any purpose of direct or indirect commercial advantage,” (2) the institution’s collection is available to the public and not just the institution’s affiliates, and (3) the reproduction includes a copyright notice referencing Section 108.

Additionally, Section 108 authorizes a library or archive to make copies for an individual user’s private research, although that exception is narrower: it includes a requirement that the institution first search the market for additional available copies before engaging in reproduction itself, and a prohibition on distributing digitized copies of works not already available in digital format outside of the institution’s physical premises.[ix]  Likewise, while the statute provides a safe harbor for “isolated and unrelated reproduction or distribution of a single copy or phonorecord of the same material on separate occasions,” it does not cover “systematic reproduction or distribution of single or multiple copies.”

Importantly, a library or archive is permitted to lend out its preservation copy or a copy created at the request of a user for private research without fear of running afoul of the right of distribution.

In theory, by creating a single digitized copy of a physical work already in its possession, the Internet Archive is complying with the statute as long as it includes the required copyright notice.  However, the question is not as straightforward because of the way that the Internet Archive’s Open Library works: since a requesting user downloads their own encrypted, time-limited copy of the digitized book, these additional, arguably systematic copies may disqualify the Internet Archive from relying on this super defense.  On the other hand, the fact that the Complaint asserts that Open Library competes with the publishers’ distribution of e-books probably means the prohibition against wide distribution of digitized copies will not be an issue in this case.

Does The Internet Archive Have a Fair Use Defense?

Whether conduct that would otherwise qualify as copyright infringement is instead considered “fair use” turns primarily on four statutory factors, the chief question nowadays being whether the use in question is “transformative.”[x]  A handful of decisions have already addressed fair use in making digital copies of copyrighted physical works, arising out of suits brought by the Authors Guild in the same jurisdiction as the current suit, and these precedents will likely set the stage.

First, in a 2014 case, Authors Guild v. HathiTrust, the Second Circuit Court of Appeals upheld a finding of fair use where the defendants made digital reproductions of copyrighted literary works “for the purpose of permitting full-text searches” and “to facilitate access for print-disabled persons” who could not otherwise appreciate the knowledge and expression contained in physical copies of the works.[xi]

Then, in a 2015 case with the same plaintiff, Authors Guild, Inc. v. Google, Inc.,[xii] the Second Circuit upheld a finding of fair use in a lawsuit against Google’s Library Project and Google Books project in a decision authored by Judge Leval, who, as a district court judge, wrote a 1990 law review article, “Toward a Fair Use Standard,” that effectively brought about the modern focus on transformativeness.[xiii]  Now-Circuit Judge Leval concluded that “Google’s making of a digital copy to provide a search function is a transformative use, which augments public knowledge by making available information about Plaintiffs’ books without providing the public with a substantial substitute for matter protected by the Plaintiffs’ copyright interests in the original works or derivatives of them.”[xiv]

However, in 2018, in Fox News Network, LLC v. TVEyes, Inc., the Second Circuit rejected a fair use defense while considering TVEyes’s “Watch Function,” which enables users to “view the Fox programming they want at a time and place that is convenient to them, rather than at the time and place of broadcast,” allowing the user to focus on a single relevant issue without having to monitor the entire length of a program for that discussion. The Court held that “TVEyes’s Watch function is at least somewhat transformative in that it renders convenient and efficient access to a subset of content; however, because the function does little if anything to change the content itself or the purpose for which the content is used, its transformative character is modest at best,” and was ultimately outweighed by the commercial aspect of TVEyes’s use.[xv]

A court may very well find that the digitization and distribution of books by the Internet Archive has little transformative character because it “does little if anything to change the content itself or the purpose for which the content is used,” akin to the use in TVEyes.  On the other hand, contrary to the Complaint’s allegations, the Internet Archive is a non-profit that “has no revenues flowing directly from its operation” of its Open Library, and whose commercial motivation, if any, is more akin to Google’s than to TVEyes’s.  That the Internet Archive reproduces entire works, not just snippets, and competes with the publishers’ e-book market, also probably weighs against a finding of fair use.

Ultimately, a court will have to balance any transformativeness and utilitarian motivation against the sheer quantity and quality of copying and competitive advantages.  This question may well determine the Internet Archive’s liability, but it is unlikely to be resolved quickly.

[i] Hachette Book Group, Inc. v. Internet Archive, Case No. 20-cv-04160 (S.D.N.Y.).

[ii] About the Internet Archive (last accessed July 1, 2020), https://archive.org/about/.

[iii] See id.; Alexis Ong, A lawsuit against the Internet Archive threatens vital gaming history, PCGamer, June 24, 2020, https://www.pcgamer.com/a-lawsuit-against-the-internet-archive-threatens-vital-gaming-history/.

[iv] See Matt Enis, Controlled Digital Lending Concept Gains Ground, Library Journal, Nov. 14, 2018, libraryjournal.com/?detailStory=181115ControlledDigitalLending; Jessica Aiwuyor, While Library Buildings Are Closed, Collaborative Digital Library Ensures Access to Books for Research, Study, Ass’n of Research Libraries, June 2, 2020, https://www.arl.org/news/while-library-buildings-are-closed-collaborative-digital-library-ensures-access-to-books-for-research-study/; Press Release, Arian Bicho, UCLA Library Secures Access to Digitized Versions of UC-held Books, UCLA Library, Apr. 9, 2020, https://www.library.ucla.edu/news/ucla-library-secures-access-digitized-versions-uc-held-books; White Paper, David R. Hansen & Kyle K. Courtney, A White Paper on Controlled Digital Lending of Library Books (2018), available at https://controlleddigitallending.org/whitepaper.  In fact, a number of higher learning institutions (including Cornell University’s and MIT’s libraries) and scholars have even endorsed the Internet Archive’s National Emergency Library. See Public Statement: Supporting Waitlist Suspension for Books Loaned by the Internet Archive During the US National Emergency, Mar. 24, 2020, https://docs.google.com/document/u/1/d/e/2PACX-1vQeYK7dKWH7Qqw9wLVnmEo1ZktykuULBq15j7L2gPCXSL3zem4WZO4JFyj-dS9yVK6BTnu7T1UAluOl/pub. Of course, not everyone agrees that the Internet Archive is in the right, including the Authors Guild and Senator Tillis, who chairs the Senate Subcommittee on Intellectual Property and has accused the Internet Archive of engaging in copyright infringement. See Letter from Sen. Tillis to Brewster Kahle, Apr. 8, 2020, available at https://www.publishersweekly.com/binary-data/ARTICLE_ATTACHMENT/file/000/004/4365-1.pdf (last accessed July 15, 2020); Op-Ed, The Internet Archive’s noble mission, Pittsburgh Post-Gazette, June 24, 2020, https://www.post-gazette.com/opinion/editorials/2020/06/24/Internet-Archive-copyright-books-archive-records-music-Thom-Tillis/stories/202006200006; Aja Romano, A lawsuit is threatening the Internet Archive — but it’s not as dire as you may have heard, Vox, June 23, 2020, https://www.vox.com/2020/6/23/21293875/internet-archive-website-lawsuit-open-library-wayback-machine-controversy-copyright; Mike Masnick, Authors Guild Attacks Libraries For Lending Digital Books, techdirt, Jan. 31, 2019, https://www.techdirt.com/articles/20190128/17205941481/authors-guild-attacks-libraries-lending-digital-books.shtml.

[v] Hachette, supra n.i, Document No. 1 ¶ 9.

[vi] Id., ¶ 10.

[vii] Alexandra Alter, ‘Emergency’ Online Library Draws Ire of Some Authors, N.Y. Times, Mar. 30, 2020, https://www.nytimes.com/2020/03/30/books/internet-archive-emergency-library.html.

[viii] A Section 108 defense will be a threshold issue since it can be answered independently from the question of infringement, whereas a court would not reach the affirmative defense of “fair use” until the plaintiff has already prima facie proven infringement. See Authors Guild, 804 F.3d at 213 (citing Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 577-78 (1994)); Ned Snow, The Forgotten Right of Fair Use, 62 Case W. Res. L. Rev. 135, 169 n.200 (2011) (collecting cases).  Put differently, Section 108 covers protected uses, while Section 107 covers permissive uses.

[ix] 17 U.S.C. § 108.

[x] 17 U.S.C. § 107; Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994).

[xi] Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014).

[xii] Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).

[xiii] Pierre N. Leval, Toward a Fair Use Standard, 103 Harv. L. Rev. 1105 (1990); see Campbell, 510 U.S. at 576-78 (citing same); Am. Geophysical Union v. Texaco Inc., 60 F.3d 913, 921 (2d Cir. 1994) (noting reduced emphasis on commercial motivation in fair use analysis).

[xiv] Authors Guild v. Google, Inc., 804 F.3d at 207.

[xv] Fox News Network, LLC v. TVEyes, Inc., 883 F.3d 169, 180-81 (2d Cir. 2018).