Using hashes and scanning to stop cloud storage from being used for infringement (2014)


Since the rise of the internet, the recording industry has been particularly concerned about how the internet can and will be used to share infringing content. Over time, the focus of that concern has shifted as the technology and copyright laws have changed. In the early 2000s, most of the concern was around file sharing applications, services and sites, such as Napster, Limewire, and The Pirate Bay. However, after 2010, much of the emphasis switched to so-called “cyberlockers.”


Unlike file sharing apps, which involved person-to-person sharing directly from users’ own computers via intermediary technologies, a cyberlocker was more of a hard drive on the internet. The issue was that some users would store large quantities of music files, and then make them available for unlicensed downloading.

Some cyberlockers were built directly around this use-case. At the same time, cloud storage companies were trying to build legitimate businesses, allowing consumers and businesses to store their own files in the cloud, rather than on their own hard drives. However, technologically, there is little to distinguish a cloud storage service from a cyberlocker, and as the entertainment industry became more vocal about the issue, some services started to change their policies.

Dropbox is one of the most well-known cloud storage companies. Wishing to avoid comparisons to cyberlockers built on the sharing of infringing works, the company put in place a system to make it more difficult to use the service for sharing works in an infringing manner, while still allowing the service to be useful for storing personal files.

Specifically, when Dropbox received a DMCA takedown notice for a specific file, the company would create a hash of that file (a computer-generated identifier that is the same for all identical files) and add it to a database. Then, whenever a user shared any file from their Dropbox with someone else (such as by creating a shareable link), Dropbox would hash that file and check the result against the database of hashes of files that had previously received DMCA takedown notices. If the hashes matched, the share was blocked.

This got some attention in 2014 when a user on Twitter highlighted that he had been blocked from sharing a file because of this, raising concerns that Dropbox was looking at everyone’s files.

Dropbox quickly clarified that it was not scanning every file, nor was it looking at everyone’s files. Rather, it was using an automated process to check files that were being shared and see if they matched files that had previously been subject to a DMCA takedown notice:

“There have been some questions around how we handle copyright notices. We sometimes receive DMCA notices to remove links on copyright grounds. When we receive these, we process them according to the law and disable the identified link. We have an automated system that then prevents other users from sharing the identical material using another Dropbox link. This is done by comparing file hashes. We don’t look at the files in your private folders and are committed to keeping your stuff safe.”
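
The mechanism described in that statement can be illustrated with a short sketch. This is not Dropbox’s implementation: the class and function names are hypothetical, and the choice of SHA-256 is an assumption, since the statement does not say which hash algorithm is used. The sketch only shows the general idea of recording a hash when a takedown notice is processed and comparing hashes when a file is shared.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Return a hex digest that is the same for byte-identical content."""
    # SHA-256 is an assumption; the source does not name the algorithm.
    return hashlib.sha256(data).hexdigest()


class ShareGate:
    """Hypothetical share-time check against hashes of taken-down files."""

    def __init__(self):
        # Hashes of files that were the subject of a takedown notice.
        self.blocked_hashes = set()

    def record_takedown(self, data: bytes) -> None:
        # Called when a notice is processed for a specific file.
        self.blocked_hashes.add(file_hash(data))

    def can_share(self, data: bytes) -> bool:
        # Called only when a user tries to share a file, not on upload.
        return file_hash(data) not in self.blocked_hashes


gate = ShareGate()
gate.record_takedown(b"bytes of a file named in a DMCA notice")

print(gate.can_share(b"bytes of a file named in a DMCA notice"))  # False
print(gate.can_share(b"bytes of an unrelated personal file"))     # True
```

In this sketch the check runs only when a share is attempted, which mirrors the statement that files in private folders are not examined. Because the comparison is an exact hash match, it catches only byte-identical copies of a file that was previously the subject of a notice.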

Decisions to be made by Dropbox:

  • How proactive does the company need to be to remain on the compliant side of copyright law?
  • Will blocking the sharing of files that might be shared for non-infringing purposes make the service less useful to users?
  • What steps are necessary to avoid being accused of supporting infringement by traditional copyright industries?

Questions and policy implications to consider:

  • There may be legitimate, non-infringing reasons to share a file that in other contexts may be infringing.
  • Is it appropriate for a company to block that possibility?
  • What measures could be put in place to allow for those possibilities?
  • The recording and movie industries have a history of being aggressive litigants against technologies used for infringement. What level of response is appropriate for new startups and technology companies?
  • Will there be limitations on innovation to services like cloud storage imposed by the need to avoid angering certain industries?

Resolution: Dropbox has continued to use a similar setup, and for the most part has avoided being compared to traditional cyberlockers. Since 2014, the issue of DMCA takedowns leading to future blocking of files has not received all that much attention either. There have been a few articles and forum discussions about how it works, with some users looking for workarounds, but for the most part this technological setup appears to have prevented Dropbox from being considered a cyberlocker-style site for infringing file sharing.

Written by The Copia Institute, November 2020
