Sunday, January 27, 2008

A security thought: AT&T Copyright Fighting

The following is just my own opinion and speculation, in answer to a hypothetical question: If I were AT&T, why and how would I implement the AT&T plan to enforce copyright on user traffic? (Note: this post is an extension of my Slashdot comment on that thread, and basically describes a "DMCA Takedown on the Network Layer" style of response.)

I also believe this would be a significant problem if implemented. I'm a believer that general network neutrality is a mostly good thing. But when a company seriously proposes filtering, I believe we should attempt to determine what shape such filtering would take, and how it could maximize the stated objectives while minimizing collateral damage. This also gives those opposed to filtering a leg up on attempting to counter it.

To begin with, AT&T probably has a huge incentive to block pirated traffic. Time Warner Cable supposedly has 50% of its bandwidth used by 5% of its users. Who wants to bet that almost all of this bandwidth is pirated material and/or pornography? As an ISP, wouldn't you want to remove 1/3rd of your traffic? Especially if it's from customers who can't really complain about it?

The strength of piracy on the Internet is the ease of getting the pirated material, and the ease of distribution. Thus pirated material must be easy to find if it is to be a substantial portion of traffic and to have a significant economic impact.

So all the MPAA has to do is find the easy-to-find content, and do something about it. Currently, they've tried playing Whac-A-Mole on the Torrent tracking servers, but this has been a losing game, as these servers have already fled to "Countries of convenience", where they are difficult for the MPAA to sue off the network.

But rather than playing Whac-A-Mole on Torrent tracker servers (which are largely offshore), with ISP cooperation from AT&T it becomes possible to play Whac-A-Mole on the torrents themselves. Such a system would benefit both the content owners and the ISPs.

All that is necessary is that the MPAA or their contractor automatically spiders for torrents. When it finds torrents, it connects to each torrent with manipulated clients. The client would first transfer enough content to verify copyright, and then attempt to map the participants in the Torrent.

Now the MPAA has a "map" of the participants, a graph of all clients of a particular stream. Simply send this as an automated message to the ISP saying "This current graph is bad, block it". All the ISP has to do is put in a set of short lived (10 minute) router ACLs which block all pairs that cross its network, killing all traffic for that torrent on the ISP's network. By continuing to spider the Torrent, the MPAA can find new users as they are added and dropped, updating the map to the ISP in near-real-time.
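
As a rough illustration of the ISP side of this scheme, the following Python sketch turns a reported swarm into short-lived pairwise block entries. Everything in it is an assumption made for illustration: the names `BlockEntry`, `pair_blocks` and `active`, the example addresses and prefix, and the simplification that a pair "crosses" the ISP's network only when at least one endpoint is a customer. The only detail taken from the text is the 10-minute lifetime.

```python
import ipaddress
from dataclasses import dataclass
from itertools import combinations

ACL_LIFETIME_SECONDS = 10 * 60  # the 10-minute ACL lifetime from the text


@dataclass(frozen=True)
class BlockEntry:
    """One hypothetical pairwise block: drop traffic between a and b until expires_at."""
    a: str
    b: str
    expires_at: float


def on_network(ip, isp_prefixes):
    """True if ip falls inside any of the ISP's (assumed) customer prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(p) for p in isp_prefixes)


def pair_blocks(swarm_peers, isp_prefixes, now):
    """Build a block entry for every peer pair with at least one endpoint on the ISP.

    Simplifying assumption: transit traffic (neither endpoint a customer)
    is ignored in this sketch.
    """
    entries = []
    for a, b in combinations(sorted(set(swarm_peers)), 2):
        if on_network(a, isp_prefixes) or on_network(b, isp_prefixes):
            entries.append(BlockEntry(a, b, now + ACL_LIFETIME_SECONDS))
    return entries


def active(entries, now):
    """Keep only unexpired entries; a block lapses unless the map is re-sent."""
    return [e for e in entries if e.expires_at > now]


# Example swarm: two peers on the ISP's (made-up) prefix, two elsewhere.
peers = ["192.0.2.10", "192.0.2.20", "198.51.100.5", "203.0.113.7"]
blocks = pair_blocks(peers, ["192.0.2.0/24"], now=0.0)
```

Because each entry names a pair rather than a single host, a blocked peer can still reach everything outside the swarm, and the entries simply age out if the reporter stops refreshing the map.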

This would be a powerful system, and the likely solution AT&T will use if they carry through on their plans to enforce copyright:

  • This requires no wiretapping. Instead, it relies solely on public information: the torrent servers and being able to contact participants in order to map those fetching an individual file. BitTorrent encryption would have no impact on this scheme.
  • It can be completely automated, both for the MPAA and AT&T.
  • It also minimizes collateral damage, since only participants in an individual torrent can't communicate with each other when a Torrent is blocked. If the MPAA actually spiders the torrent (rather than trusting information from the trackers), there should be no false edges in the graph. The only collateral damage is if a pair of systems is also performing legitimate communication at the same time they are participating in the Torrent, something the ISP probably considers acceptable.
  • Any real collateral damage (incorrectly blocking content) AT&T can say is the fault of the MPAA.
  • It should be robust in the arms race: if the pirated material is open and distributed in a P2P manner, the MPAA's spiders should be able to track it. (Remember, even if CAPTCHAs are used to protect trackers or aspects of the systems, solving a CAPTCHA only costs $.01).
  • And it's inexpensive. All AT&T has to do is deploy a small program to set and release a bunch of router ACLs, and that's it. AT&T can even keep the number of ACLs reasonably low, because they expire quickly and only need to be partially effective. No new hardware is required and everything can be fully automated. All the real costs (of spidering the Torrents, content identification, affirming that it is actually a copyright violation, and constructing the graphs) are placed on the MPAA or their contractor.

Likewise, (IANAL) AT&T can possibly avoid most liability. They aren't doing any wiretapping, nor even making a decision about which traffic to block.

Finally, AT&T has a huge number of reasons to deploy such a system:

  • It keeps the content providers happy for when they are negotiating their compete-with-iTunes/Netflix video on demand and cable TV services.
  • It keeps the content providers from pushing through very draconian legislation, or at least draconian legislation you aren't happy with. (It can F-up your competitors, but that's just a bonus.)
  • And it drops their bandwidth bills by 30-50% by eliminating a large amount of deliberately-noncacheable (both politically and because of BitTorrent encryption) traffic.

This won't stop closed-world pirates, those with significant barriers to entry and secrecy, but those are far less significant. Closed-world pirates are much lower bandwidth for the ISP, because it's far more difficult for pirates to get the content. But it should be able to shut down BitTorrent for open-world piracy, without blocking legitimate BitTorrent. It also won't stop child porn, although AT&T would probably claim that it does.

This was speculation. I have no evidence that this is what AT&T is planning. But given the huge expense (deep packet inspection), legal implications (wiretapping, false positives) and limitations (cryptography), I find it doubtful that AT&T really wants to detect copyrighted material directly. Performing deep packet inspection at line rates, especially to match a large database of copyrighted material, is hugely expensive, and would fail in the presence of encrypted Torrents and SSL-equipped Torrent search servers.

Thus I'm almost certain that if AT&T truly wishes to carry forward with its copyright-enforcement plans, the system will be similar to the one I've described.

If they do deploy copyright enforcement, detecting this would be possible by participating in torrents (to generate the block) and then checking how that affects connectivity. If AT&T blocks Torrents but other TCP connectivity in those port ranges remains between two hosts, they aren't using only the speculated system. Instead, they would have to be determining that an individual flow is participating, information which can only be obtained by directly monitoring communication between the two hosts.
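
The inference in this test can be written down as a small decision function. This is only a sketch of the reasoning, not a measurement tool: the function name, its two boolean inputs, and the result labels are all made up for illustration, and it assumes both hosts are under the tester's control and have already joined the same torrent.

```python
def classify_blocking(torrent_flow_ok: bool, other_tcp_ok: bool) -> str:
    """Interpret one connectivity test between two hosts in the same swarm.

    torrent_flow_ok: the BitTorrent connection between the hosts still works.
    other_tcp_ok:    an unrelated TCP connection between the same two hosts,
                     in the same port range, still works.
    (Both inputs are hypothetical measurements supplied by the tester.)
    """
    if torrent_flow_ok and other_tcp_ok:
        return "no blocking observed"
    if torrent_flow_ok:
        # The torrent works but other traffic does not: not explained by
        # either mechanism discussed here.
        return "inconclusive"
    if other_tcp_ok:
        # Only the torrent flow is cut, so something is telling flows apart.
        return "per-flow inspection"
    # Everything between the pair is cut, as a pairwise ACL would do.
    return "consistent with pairwise ACL"


result = classify_blocking(torrent_flow_ok=False, other_tcp_ok=True)
```

A single test proves little either way; repeating it across several torrents and host pairs would be needed before drawing conclusions.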

EDIT/addition: Richard Bennet has also discussed this technique at the Network Neutrality forum on 1/26/2008 (Slides at Richard Bennet's web site, on how easy it is to find pirated materials and participating peers to tell the ISP what to block).

He also brings up the important question: "Is there any reason that such an automated system should not be used, or does Net Neutrality now connote a license to steal?" This is a tough argument to counter.

The ongoing discussion can be viewed at The NNSquad Mailing List archive.

EDIT/addition #2: Delayed release of keys (distribute then release keys, as Richard Clayton pointed out) would slow down any spider, but also slows down users from getting content. The spider could still block all users after the key is released, and as people couldn't tell what they are downloading BEFORE the key is released, the MPAA could produce a large number of poisoned (false data) torrents during this window.


SEoD said...

Interesting post.. you should definitely keep the blog going!

I assume that your plan targets the BT clients' IPs rather than the trackers' IP(s) because trackers can now be distributed.

I don't think it works if the clients are connected via TOR though. That is: your snitch client might detect a TOR exit node's IP instead of the pirate's. As the BT client would probably generate a new TOR path for each connection (not 100% sure this is the case, but I suspect it is), that means it's exposing a lot of TOR nodes to your snitch, which might make TOR unusable (even for valid purposes) if your snitch is very fast at collecting IPs. If it's not fast at collecting IPs, there will be plenty of TOR exit nodes to go around and the effect will be minimal.

The ISP still wouldn't know who the pirate was (excepting attempts to directly attack the anonymity of TOR).

Nicholas Weaver said...

Yes, it's "Disrupt the clients" not "disrupt the tracker"

Tor is pretty irrelevant, because

a) It's hard to use, so it wouldn't get used much.

b) There isn't much bandwidth. If a large number of people did their piracy through Tor, Tor would collapse under the traffic load.

c) Most retail ISPs don't want Tor exit nodes running on their network, as that violates the "no servers" portion of the TOS. But you're right, you'd get that collateral damage. Hmm, interesting, and not a good thing.

Anonymous said...

The most obvious counter to this system is for trackers to simply compile a list of known RIAA/MPAA IP ranges and drop all connections to them. That actually seems pretty easy, and it wouldn't take long to find from the logs who the snitch was and drop the entire /16. That's what I would do.

Other counters include VPN into free countries - such services exist already, and of course a rise in tightly controlled private trackers.

Apart from the technical point, I don't see users taking this too well. Unless they forced it across all ISPs, I could see AT&T losing significant market share from this - even if I had no interest in downloading any pirated content I would instantly leave any ISP that played that kind of game with my connections.

Interesting idea though.

Danny Colligan said...
This comment has been removed by the author.
Danny Colligan said...

Following up on anonymous' comment, the MediaDefender IPs can be found here (scroll down):

Also, I believe an issue is not just that AT&T will do this for their subscribers, but also customers of other ISPs whose packets go through AT&T's system at some point along the route.

(Earlier comment removed because I mistakenly wrote "MediaSentry" instead of "MediaDefender")

k2 said...

Another counter measure against a snitch-bot would be to incorporate a Captcha challenge into the process of starting a torrent download.

Andrew Farrell said...

As an ISP, wouldn't you want to remove 1/3rd of your traffic?

That depends a lot on whether they're charging per-byte!

Anonymous said...

Simple, protocol-level countermeasure:

1. Encrypt all the data with a key.
2. Split the encrypted data into N blocks.
3. Calculate N subkeys such that XORing all subkeys together yields the key to decrypt all the data.
4. Attach a subkey to each block.
5. Instead of transmitting plaintext blocks, only transmit encrypted blocks with subkeys attached.

Result: you cannot decrypt any portion of the data until all of it has been downloaded. The movie industry must now download the full contents of every torrent they spider before having even the slightest clue which are infringing; this at least buys time for the pirates to distribute to genuine users, and may very well make the bandwidth requirements for the spidering companies infeasible.
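
Steps 3 and 4 of this countermeasure amount to XOR secret splitting, which can be sketched in a few lines of Python. The function names below are made up for illustration, and the actual encryption of the data in step 1 is deliberately left out, since the comment does not say which cipher would be used.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, n: int) -> list:
    """Return n subkeys whose XOR equals key (step 3 of the scheme)."""
    if n < 1:
        raise ValueError("need at least one block")
    # n - 1 subkeys are random; the last is chosen so everything XORs to key.
    subkeys = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = reduce(xor_bytes, subkeys, key)
    return subkeys + [last]


def recover_key(subkeys: list) -> bytes:
    """XOR all subkeys together; only correct once every block has arrived."""
    return reduce(xor_bytes, subkeys)


# Example: a 16-byte key split across 8 blocks.
key = secrets.token_bytes(16)
subkeys = split_key(key, 8)
```

Since all but one subkey is uniformly random, any set of fewer than N subkeys reveals nothing about the key, which is what forces a spider to fetch every block before it can inspect the content.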

Anonymous said...

This is the single most stupid thing I have heard in a long time.

Why should the MPAA have any kind of privilege? I own copyrights so surely your scheme should also allow me to perform the same filtering the MPAA does.

By extension pretty much *anyone* owns a copyright. Thus AT&T must let anyone block anything they want (would not doing so be considered Anti-trust?).

So now you have a great system where anyone can block any traffic, how highly secure, it doesn't open the entire network up to the threat of DoS attacks does it?

Anonymous said...

They are also known for incorrectly identifying infringement. In essence this scheme could very well block access to legitimate users -- since you're just trusting that the feed you get is correct.

Danny Colligan said...

In response to the comment that begins "This is the single most stupid thing..."

Nobody is defending the actions of the media companies here. What they are asking for is entirely unreasonable. However, what media companies do have is a lot of lawyers and lobbyists. Corporations would rather give them a slice of the pie than get into a grueling, costly legal battle (for example, Microsoft paying royalties on every sale of the Zune). Government officials have passed legislation that has established Bad Ideas like retroactive copyright extensions as law at the behest of the media companies. The media companies' demands may be silly, but that doesn't necessarily stop them from getting their way. Who knows if the ISPs will give in because of pressure from the court room or Capitol Hill?

Nicholas Weaver said...

Or, as I postulate, WANT to give in!

AT&T has publicly stated that it wants to wage the copyright battle on the MPAA's behalf. So the question is HOW would they do it, and WHY would they want to?

Steve Simmons said...

The most obvious 'attack' on the proposed system is to introduce false positives. This causes the ISP to block non-infringing traffic, and the sh*t hits the fan. For example, implicate the root DNS servers as torrents and watch the ISP fall over dead.

And yes, the ISP could then start whitelisting various hosts - but that's an endless tail-chasing exercise, and eventually they'll have to give it up as a bad job.

Nicholas Weaver said...

It's actually really REALLY hard to maliciously false-positive on existing flows. Basically, you'd need to pose as the MPAA and create fake DMCA notices, or take over the ISP's control (with which you could probably do a lot more interesting things).

Likewise, by having it be "DMCA a graph" rather than a host, this would not be suitable for censoring a web site.

The interesting question is fair use.

But even then, it has to be something that would be WIDELY distributed, and couldn't be distributed by just putting up on YouTube or some other web site. There are tons of channels with no cost to the distributor for legitimate content, but where a DMCA takedown notice quickly greets violations.

Anonymous said...

Has anyone mentioned the torrent servers/trackers blocking the IPs of the spiders? This is simple website defense against page scraping, right?

Anonymous said...

Having trackers block the spiders just gives another tool for ISPs to use to stop their users sharing copyrighted data. Just cycle your spider through all the IP address ranges used by your customers and wait for the trackers to block your entire customer base. Problem solved with far fewer hassles.

Anonymous said...

"first transfer enough content to verify copyright..."

Verifying infringement can't be automated (consider fair use). Sadly, it also seems like an unnecessary step. The ISPs err on the side of caution and will immediately take down anything the MPAA reports. The ISP won't ask for infringement proof before killing the traffic.

With this in mind, the algorithm becomes... find the most popular torrent, report it as infringing (whether it is or not), repeat.

Anonymous said...

You are too late. There is already a company that is tracking over a million torrents in realtime and about 150 million peers in REALTIME. The company is called divinity Metrics but they use this data for marketing.

Zythom said...

Two quick points:
-- It seems to me that dynamic IPs will be a problem
-- What if I introduce a mandatory human action to break the automated side of your scheme, e.g. encryption? Verification of copyright would then need to be manual.
Am I wrong?

Nicholas Weaver said...

A few comments:

a) I'm not FOR this, but it's how I'd DO this.

b) Dynamic IPs are not necessarily a problem. This is a short-timescale operation, so they are "Static Enough" during this time.

c) Encryption of the file and later key distribution makes piracy inconvenient, bringing it into the closed world model, and vulnerable to false files etc. Especially from the ISP's viewpoint, this is perfectly fine.

d) There is plenty of precedent for automated DMCA takedowns: they happen all the time on YouTube.

Zythom said...

a) I haven't said you were FOR that
b) I think dynamic IPs ARE a problem for people who will get a blacklisted IP
c) Encryption keys are not necessarily distributed later: they can come with the file. I just wanted to introduce a human action needed to manually verify the copyright
d) YouTube videos are not encrypted in such a way.

Nicholas Weaver said...

a) Others have.

b) Firstly, the observation is a blacklist GRAPH. So only if

1) You were assigned an IP previously used by a member of the graph.

2) You wanted to communicate with another member of the graph

3) Before the graph's block expires

Would the reuse of an IP be a problem.

As for anything CAPTCHA-like, CAPTCHAs are cheap to solve, and anything easy for a human must cost only $.01/each to outsource to China, because that's already done.

Anonymous said...

torrents, newsgroups, forum sites gratcha's (upcoming) etc. etc. there is always a way to share files.
Otherwise many countries have jurisprudence that it is not possible to help for ISP, on the other hand many ISPs are working on it, because they have customers for it. Cheaper prices, that's the only thing that will solve this problem. As an old and now still professional security IT person specialized in the Internet world-wide I only can say one thing: with all respect, this is just a chicken/egg story..... and will not work. Share torrents with password protection, where only identified people get the password, which is time dynamic, so I know the range of people I have sent the information to; this is just an example of how some torrent sharer can hit back. I know more solutions to share torrents only to a specific group of people, fully tracked, thanks to the search engines on the web...

Mind Booster Noori said...

There are two points of failure here:
1) There's no way of making a bullet-proof content identification system (Vobile is a good example of how these systems can fail);
2) Downloading copyrighted material might not be illegal, and there's no way to check it (if I download some mp3's from an album I have, even if it is copyrighted, I'm not doing anything illegal).

Danny Colligan said...

@mind booster:

Are you a lawyer? I'm not sure that "I already own it on CD so I can download it whenever I please" argument would stand up in court (granted, it makes perfect sense to ME, but copyright violations are a guilty-until-proven-innocent kind of game). Copyright is a monopoly on distribution and you are violating that by downloading someone else's copyrighted content. Plus, media companies have for a while tried to assert that making backup copies or more personal copies or changing copyrighted content to another format should be illegal. If you do know of a case where this argument has been made, I'd appreciate a link.

In any event, I doubt this is a consideration AT&T shares.

Anonymous said...

Your technique of crawling torrents for IPs is very similar to the technique used in this paper to discover IPs of suspected MediaDefender attackers.

I am curious how easy it would be to automate the process of checking the media file for copyright infringement. There are so many different file formats posted. Also, the way bittorrent downloads piece by piece, wouldn't you need to download the entire file first?

Anonymous said...

Having 20 years of experience in networking, I can say for sure that AT&T cannot put ACLs like this in routers; current hardware is not ready for that. You can black-hole an IP quite easily, but you cannot block communication between 2 IPs.

carterson2 said...

Why don't you patent the idea? I know it would be fun to build, so you could build it, but then what? Run it for Freedom Foundation, or run it for the MPAA? This isn't really neutrality, but I can tell you want to build it as would I ;-)

I still say, patent it. Patents are all bad. Don't believe anyone who tells you otherwise.

carterson2 said...

I meant patents aren't all bad.