
James Comey’s Twitter Security Problem Is Your Problem, Too

Information security used to be able to lock down data. Now we must beware of how algorithms handle our secrets.
April 27, 2017

Until recently, many of the big splashy data leaks—like the Target and Sony hacks—happened because the data wasn’t protected. Proper encryption probably would have prevented those breaches.

Now things are different. We’re seeing more and more problems arise not because data isn’t protected, but because it’s improperly protected.

Until recently, few people outside of academia paid much attention to a concept called “information flow security,” which involves checking data as it interacts with software. Then along came “This Is Almost Certainly James Comey's Twitter Account,” a Gizmodo article that captures everything it means to be a modern information leak.

The story behind the article: a journalist named Ashley Feinberg wanted to find FBI director James Comey’s secret Twitter account. She started digging around the Internet and was able to uncover the account in just four hours, in large part by exploiting a key information flow bug in Instagram.

Feinberg describes how she used facts about Comey and his family to find a public tweet leading her to a public Instagram comment leading her to the protected Instagram page of Comey’s 22-year-old son, Brien.


After Feinberg sent requests to follow Brien, Instagram recommended that she also follow “reinholdniebuhr,” another protected Instagram account that matched what Feinberg already knew about James Comey’s Instagram account. Even better, the Twitter handle with the same name matched what Feinberg knew about Comey’s Twitter account.

There is a lot happening here, much of it outside the control of a software developer at Instagram, but at the core of this leak is an information flow bug. Feinberg relied on several key public facts about Comey, but she would not have been able to find his Twitter account had Instagram not inadvertently provided the vital clues. And Instagram would not have revealed this information had the code been properly enforcing information flow security.

There’s an inconsistency between how Instagram protects the information on account profiles when users try to access it and how it protects this information when it’s used in various algorithms. When you try to view the Instagram page of a protected user, you cannot access that person’s photos or see who that user is following. It turns out, however, that the protected account information is visible to algorithms that suggest other users to follow, a feature that becomes—incorrectly—visible to all viewers once a follow is requested.

In this case the policy violation is particularly insidious because the items actually leaked, reinholdniebuhr’s profile photo and name, are both public across Instagram. What should be private is the relationship between Brien Comey and this reinholdniebuhr account. While it’s possible that Instagram randomly showed reinholdniebuhr as a recommended account to follow out of its 600 million active monthly users, what is more likely, especially given that the other recommended users had the last name Comey, is that Instagram’s recommendation algorithm used secret “follow” information to compute which accounts to recommend. In information flow nomenclature, the leak of secret information through the display of public information is called an “implicit flow.”
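To make the shape of the bug concrete, here is a deliberately simplified sketch in Python. The account names, data structures, and recommendation logic are invented for illustration; this is not Instagram’s code. The point is that even though the suggestion function only ever outputs public usernames, its output depends on a secret follow list, which is exactly an implicit flow.

```python
# Purely illustrative sketch; the account names, data structures, and
# recommendation logic here are invented, not Instagram's actual code.

APPROVED_FOLLOWERS = {"brien": {"mom", "dad"}}
PRIVATE_FOLLOWS = {"brien": {"reinholdniebuhr", "another_private_account"}}

def view_follows(viewer, account):
    # Direct access is checked: only approved followers see the follow list.
    if viewer not in APPROVED_FOLLOWERS[account]:
        return []
    return sorted(PRIVATE_FOLLOWS[account])

def suggest_follows(viewer, requested_account):
    # The bug: the same secret follow list feeds suggestions shown to any
    # viewer who merely requests to follow the account. The usernames shown
    # are public, but suggesting them reveals the secret relationship.
    return sorted(PRIVATE_FOLLOWS[requested_account])

print(view_follows("journalist", "brien"))     # [] -- the direct check holds
print(suggest_follows("journalist", "brien"))  # leaks who "brien" follows
```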

Just as proper encryption would have prevented the Target and Sony hacks, there are solutions for preventing information leaks like this one. There are decades of research on information flow security techniques: some that check software before it runs, others that monitor software as it’s running. This work is much more than theoretical: people have built operating systems and Web frameworks based on these ideas. Such systems would have detected if a recommendation algorithm was leaking secret follow information and prevented the leak. But even with these approaches, the programmer still needs to reason about the complex and subtle interaction of policies with each other and with the code to produce software that doesn’t leak information.
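To give a flavor of how these techniques work, here is a minimal runtime-monitoring sketch. The Labeled class, the labels, and the release check are invented for illustration; real information flow systems, whether operating systems or Web frameworks, are far more sophisticated. The idea is that any value derived from secret data stays secret, and a check at the output boundary refuses to send it to a public channel.

```python
# Minimal illustrative sketch of runtime information flow checking with
# labels. The classes and label names are invented for this example.

class Labeled:
    def __init__(self, value, label):
        self.value = value
        self.label = label            # e.g. "public" or "secret"

def derive(inputs, f):
    # Any value computed from a secret input is itself secret.
    label = "secret" if any(x.label == "secret" for x in inputs) else "public"
    return Labeled(f(*[x.value for x in inputs]), label)

def release(labeled, channel_label):
    # Sink check: secret data may not flow to a public channel.
    if labeled.label == "secret" and channel_label == "public":
        raise PermissionError("information flow violation")
    return labeled.value

follows = Labeled({"reinholdniebuhr"}, "secret")
suggestions = derive([follows], lambda f: sorted(f))

try:
    release(suggestions, "public")
except PermissionError as e:
    print("blocked:", e)   # the leak is caught before anything is displayed
```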

In my lab, we’re attempting to make it easier for developers to implement information flow policies. We have the machine take responsibility for managing the interactions among policies, and between policies and the program, so that recommendation algorithms don’t leak information. The policies also specify what values the machine can use when the actual values must be kept secret. For example, if a search algorithm isn’t allowed to use someone’s exact location, maybe it can use the corresponding city.
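As a rough, hypothetical sketch of that idea: each sensitive value carries both its real content and a policy-approved substitute, and the policy decides which one a given viewer or computation gets to see. The Faceted class and the coarsen-to-city example below are invented for illustration, not the interface of any deployed system.

```python
# Illustrative sketch of a policy-carrying value with a public fallback.
# The Faceted class and the location-to-city policy are invented examples.

class Faceted:
    def __init__(self, secret, public):
        self.secret = secret          # exact value, e.g. GPS coordinates
        self.public = public          # policy-approved substitute, e.g. city

    def view(self, viewer_may_see_secret):
        return self.secret if viewer_may_see_secret else self.public

location = Faceted(secret=(40.4406, -79.9959), public="Pittsburgh")

# A search algorithm not cleared for exact locations automatically
# computes with the coarser, city-level value instead.
print(location.view(viewer_may_see_secret=False))   # "Pittsburgh"
print(location.view(viewer_may_see_secret=True))    # (40.4406, -79.9959)
```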

Even though digital security should be one of the main concerns of the FBI, Comey couldn’t avoid the problems that arise from the mess of policy spaghetti that is modern code. Though this leak affected many fewer people than the large data breaches, it marks an important shift in information security.

Until now, most people have thought about security in terms of protecting individual data items, rather than in terms of complex and subtle interactions with the programs that use them. But we now live in a world in which the director of the FBI trusts our Internet infrastructure, and Instagram, enough to put 3,227 private photographs online. On the one hand, this means that we’ve reached a certain level of security. On the other, it means that we can now focus on more advanced security problems. And when anyone with good deduction and access to the Internet can find out all sorts of information, our work on information security is far from over.

Jean Yang is an assistant professor in the computer science department at Carnegie Mellon University and was named one of MIT Technology Review’s Innovators Under 35 in 2016.
