What Did Facebook Know, and When Did They Know It?


Kidding. But also, not really, because the TL has been full of debate - yes, still - about whether the Facebook/Cambridge Analytica breach/hack/whoopsie-doodle is actually a breach, hack, or whoopsie-doodle. In the last 24 hours I've seen people insist that Facebook's data access was definitely a breach, definitely not a breach, clearly a hack, and just a "loophole" that Cambridge Analytica happened to exploit until it was closed for mysterious reasons that no one is admitting to - reasons which certainly don't indicate either a bug Facebook knew about and considered low priority, or a bug they allowed people to actively exploit because it benefited them.

Sure!

Anyway, breaches and hacks, and any other word you could possibly come up with to describe an attacker gaining unauthorized access to systems or data or both, do indeed happen all the time because of bugs (or "loopholes", which is a weirdly popular term in this conversation, for some reason). One of the most popular ways to get a foothold in a system is to just ask the system about itself. To generalize broadly, we developers are not great at thinking like attackers, and we have all kinds of useful tools that we sometimes accidentally leave publicly exposed or otherwise fail to secure. I'm not a pentester, but as I understand it, that's basically part of the point of Shodan.
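
To make that concrete, here's a minimal sketch (in Python, with example.com standing in as a placeholder host) of what "asking the system about itself" can look like: a single HEAD request that most web servers will answer by volunteering their software name and version. As I understand it, this kind of banner data is roughly what Shodan indexes at internet scale.

```python
# A minimal sketch of "asking the system about itself": grab the HTTP
# Server header from a host. The hostname is a placeholder, not a target.
import http.client

def grab_server_banner(host: str, port: int = 80, timeout: float = 5.0) -> str:
    """Send a HEAD request and return whatever the server says about itself."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        # By default, many servers happily volunteer software name and version.
        return response.getheader("Server", "(no Server header)")
    finally:
        conn.close()

if __name__ == "__main__":
    print(grab_server_banner("example.com"))
```

Nothing in that sketch is exotic; that's the point. Reconnaissance often starts with information a system hands out for free.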

I'm also at pains to point out that if I convince an old lady that I'm her granddaughter, and she gives me lots of money, and she is happy to do so because I am her beloved granddaughter, I have still conned her. Because I'm not her granddaughter in reality! She gave me money because I lied! It counts! Falsifying scenarios to get people to give you something you shouldn't have is the basis of social engineering!

But I digress, because ultimately, the real question is not whether or not this counts as a "breach". Facebook would love for you to think that is the real question, because if we are all out here debating breach vs bug or whatever, we don't stop to ask Facebook questions like:

  1. When did you know about this bug?
  2. Why did you close it when you did?
  3. How many accounts were affected?
  4. Why didn't you do anything sooner?
  5. What processes does Facebook have in place to formally evaluate risk and privacy as they affect its users?

The answer to 1 is pretty simple: a while! Here's an old article about how they closed off access to friends' data. Of course, that article indicates that a full year passed between Facebook's "privacy concerns" and actually shutting down the access. A year in human time is quite a lot longer if you are a computer automatically gathering as much data as you possibly can! So even this weakly expressed "privacy concern" is indicative of profound irresponsibility (and of enabling on the part of tech media, but we all knew that already).
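
For the curious, here's a rough sketch of the access pattern in question, as I understand the pre-2015 Graph API v1.0 to have worked. The endpoints shown are the historical v1.0 paths and the old friends_* permissions; none of this runs against today's API. The point is the shape of it: one user consents, and the app walks away with data on every one of their friends, none of whom ever installed anything.

```python
# A rough sketch of the historical Graph API v1.0 access pattern, not a
# working client: a token carrying the old friends_* permissions let an app
# walk one consenting user's friend list and read those friends' data too.
import requests

GRAPH = "https://graph.facebook.com/v1.0"  # historical version, long retired

def harvest_friends(user_token: str) -> dict:
    """One user's consent, many users' data: the shape of the old API."""
    dossiers = {}
    # Step 1: the consenting user's friend list.
    friends = requests.get(
        f"{GRAPH}/me/friends", params={"access_token": user_token}
    ).json()
    # Step 2: for each friend - who never installed the app - pull whatever
    # the old friends_likes permission exposed.
    for friend in friends.get("data", []):
        likes = requests.get(
            f"{GRAPH}/{friend['id']}/likes", params={"access_token": user_token}
        ).json()
        dossiers[friend["id"]] = likes
    return dossiers
```

That asymmetry - one consent, hundreds of dossiers - matters later, when we get to Facebook's "everyone involved gave their consent" defense.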

The answer to 2 piggybacks off 1: they closed it after a year because uhhhhhhhh (long gif of Zuck's unblinking stare) privacy! How much they knew about how the data was being abused appears to still be a mystery. "Privacy concerns" is very vague language, and the TechCrunch quotes from Facebook representatives make it sound like Facebook's spin, at the time, was that shutting down the API was effectively preventative: part of a migration, yes, but also part of an effort to make people feel more comfortable on Facebook-the-platform. This isn't the language you'd use if you were on the defensive after a breach you knew about - unless, of course, you were confident in your ability to cover up the extent of the abuse that had already occurred.

3: Hey, thanks to the Cambridge Analytica news coverage, we have a partial answer to this! Up to 87 million people, most of them in the US, had their data compromised - sorry, legally accessed but "improperly shared" - by Cambridge Analytica. WHOOPSIE DOODLE!

Most adults would probably point out that the answer to 4 is money, plus a reluctance to alienate all their developers by forcing a rapid migration to Graph API 2.0. As a developer, I have some sympathy for this; dealing with breaking changes in your dependencies is genuinely difficult. But of course, we're not just talking about Python 2 vs. Python 3 str handling. We're talking about "improper access" to the tune of 87 million individuals' digital dossiers. Either Facebook's nebulous concerns about privacy were truly nebulous concerns, or they knew about the "improper access", knew it violated their policies, knew the depth of data being accessed represented a privacy crisis, and covered it up.
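
For anyone who hasn't lived through that particular migration, here's a tiny illustration of the str/bytes breakage I mean (Python 3 semantics shown; Python 2 silently coerced in both cases):

```python
# The canonical Python 2 -> 3 breaking change: in Python 2, str was bytes;
# in Python 3, text and bytes are distinct types that refuse to mix.
text = "café"          # str: Unicode text in Python 3
data = text.encode()   # bytes: what actually goes over the wire

print(text == data)            # False in Python 3; Python 2 compared equal
try:
    combined = text + data     # Python 2 would silently coerce; 3 refuses
except TypeError as exc:
    print(f"TypeError: {exc}") # can't concatenate str and bytes
```

Annoying, yes. Worth delaying the shutdown of an API leaking friends' data for a year? That's a different calculus.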

And so, to me, 5 is the big one. Tech companies historically absolutely hate being held accountable, and sell themselves to employees on the myth that "processes" such as, oh, making sure your self-driving car won't kill someone, or making sure HR stops John in Accounting from being a serial predator, are simply innovation-killing redundancies. Facebook loves to pretend it's Just A Website, even as every piece of evidence suggests that Facebook, internally, sees itself as a kind of godlike panopticon-powered force for good. And, of course, every piece of evidence also suggests that Facebook behaves like a godlike panopticon-powered force for evil. Or chaos, if we're being generous.

Either way, the truth is that moving to shut down access to friends' data as part of an API update indicates that significant internal conversations were happening. Facebook appears to be pursuing a PR strategy that would have people believe that they simultaneously maintain enormous stores of data on individuals - massive dossiers that are formatted well enough to be not just accessible to outside organizations, not just usable, but wildly popular and profitable - while also having zero controls on who gets that data. This is, of course, incredibly believable within the context of tech. But it betrays a profound irresponsibility that would get any other industry dragged in front of regulators faster than you could say "Kinder egg". Imagine if Ford started selling cars with untested brake lines that you had to hook up yourself, and the only mention of this flaw - sorry, "loophole" - was buried 50 pages into a 2000-page manual.

This is what Facebook is asking people to accept: they had the data, they knew the data was being abused, and at some point they talked about privacy concerns related to unfettered access to the data. They have thus far refused to account for what exactly they knew about the extent of data abuse, and when they came to know about it. They have insisted that Cambridge Analytica's access was not a breach, was "improper" access, and was in fact a violation of the terms of service. These might all be true - perhaps Facebook only considers it a breach if I brute-force Sheryl Sandberg's password! - but, taken together, Facebook's public statements show a sort of petulant misunderstanding of why the public is even asking them questions. After all, says Facebook's official representative, "everyone involved gave their consent. People knowingly provided their information[.]" Surely everyone reads those long statements of deliberate obscurity and complexity before using one of the 95 apps they have on their phone? If not, why not? And so on. Facebook desperately wants to turn attention away from its own shortcomings, because the truth can only be one of two things: either Facebook knew about "improper" use of their API and turned a blind eye, or they were so arrogant, and so irresponsible, that they never bothered to pay attention to their own Orwellian creation.

We should consider either option equally chilling, and equally actionable.