The Looming Software Catastrophe

A couple of stories this week don’t seem directly connected, but do suggest a theme: Wednesday, an automatic-trading program at Knight Capital went out of control and spewed “a torrent of faulty trades” onto the stock market for half an hour before any humans at the company figured out how to pull the plug. Knight’s losses are estimated at $440 million, which might kill the firm.

Thursday, a Republican filibuster scuttled the proposed cybersecurity bill, which

would have established optional standards for the computer systems that oversee the country’s critical infrastructure, like power grids, dams and transportation.

The Republican senators filibustered even though the bill had already been watered down to the point of being useless. A cybersecurity expert described the original bill as “all sticks and no carrots”. So when companies protested, a compromise made the sticks voluntary. We wound up with a bill that would allow masochistic corporations to submit to a voluntary binding.

Even that couldn’t pass. (The bill also had privacy and civil-liberty problems, but that’s not what brought it down.)

So here’s what we learned this week:

  • Bad software, even without any apparent malicious intent, can destroy a Wall Street firm in less time than it takes a banker to finish his second martini.
  • Software controls the power grid, oil and gas pipelines, the banking system, and a few other things that terrorists, hacktivists, hostile foreign powers, or joy-riding teen-agers might like to screw up.
  • The most toothless imaginable attempt to establish some security standards is too onerous for Congress’ corporate masters to allow.

What can you conclude from that? I think there’s a trend here that won’t stop until there’s a disaster big enough to scare the pants off everybody. And that means: There will be a disaster big enough to scare the pants off everybody.

The software culture. Our software development culture revolves around promising whiz-bang new features. If you can actually make them work most of the time, that’s even better. If they could fail without taking down the whole system, that would be nice — but hey, nobody’s perfect.

The problem is that security, reliability, and resiliency aren’t “features” that you can add to an already-existing product. They are systemic virtues that have to be designed in from the beginning and supported by a continual process of reporting, error containment, and a system-wide appreciation of rigor. Fundamentally, that’s bureaucratic and slows things down. We hate stuff like that.

That is, we hate it until some rogue application starts losing $15 million a minute. Then, slowing things down starts to sound like a good idea.
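The "$15 million a minute" figure is just the article's loss estimate divided by the glitch's duration; a quick back-of-the-envelope check (assuming the roughly 30-minute window and the estimated $440 million loss reported above):

```python
# Sanity check on the "$15 million a minute" figure.
# Both inputs are estimates from news reports, not official numbers.
loss_total = 440_000_000   # dollars, Knight Capital's estimated loss
minutes = 30               # approximate duration of the runaway trading

loss_per_minute = loss_total / minutes
print(f"~${loss_per_minute / 1e6:.1f} million per minute")  # ~$14.7 million per minute
```

Call it $15 million a minute, give or take.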

And that’s just a screw-up. What could somebody do if they really worked at it? That’s the so-called “digital Pearl Harbor” scenario, where an enemy doesn’t bother with bombs or bullets; it just takes down some key element of our infrastructure.

It could be that, or just a glitch, or some hacker’s joke that gets out of hand. But you can’t build vulnerable systems on top of other vulnerable systems forever and think that nothing is going to happen.

The political culture. Thursday’s vote made it clear that the government is not going to get out in front on this. Congress isn’t going to act until either business or the general public demands it. But how is that going to happen?

The business world doesn’t know how to think about stuff like this. Nobody knows how to estimate the odds of a software disaster, or predict the resulting damage, so there’s no box on a spreadsheet that says how much a firm is risking with its current software. Making a company’s software more secure would entail both a capital cost and an operating cost. What can a CEO compare that to in order to get a return-on-investment estimate? Without that estimate, software security just looks like a voluntary reduction of profit. What CEO is going to sign up for that?

But security could show up on a different part of the spreadsheet: not risk assessment, but marketing. Maybe the public would pay more for a secure product than an insecure one. Or, looking at it a different way, people might choose to locate a bank or brokerage account at a firm that cared about software security.

But how is that going to work? You have no idea how the magic happens when you log into your bank account, and in the absence of a major catastrophe, you have no way to judge whether one system is more secure than another. You’re certainly not going to pay extra for a security claim you can’t judge.

And financial firms aren’t going to advertise that their risk is lower than somebody else’s, because they don’t want you to think online transactions are risky at all. They’re not going to start talking about security until after the public is scared.

A simple technique for predicting the future. Keep these two principles in mind:

  • A trend will continue until something stops it.
  • Trends that can’t go on forever won’t go on forever.

So if X is the only thing that could possibly stop a trend that can’t go on forever, sooner or later X will happen.

The trend that can’t go on forever is piling vulnerable systems on top of vulnerable systems. And I think I’ve just talked myself into believing that it won’t end until the public gets scared.

So X, in this case, is an event that scares the general public about software security. I can’t tell you what it’s going to be: a nationwide blackout, some unrecoverable loss of financial records, who knows? But it has to be big enough to break through our collective denial.

I also don’t know when that’s going to happen. Tomorrow? Five years from now? But if that’s the only way for an unsustainable trend to end, it’s got to happen.


Comments

  • Bob Lee  On August 7, 2012 at 6:12 pm

“Piling vulnerable systems on top of vulnerable systems” only increases risk if the vulnerabilities align. In most cases they do not, so the only vulnerability that matters is the one in the topmost layer.

  • mjfgates  On August 10, 2012 at 7:46 pm

    What makes you think that an event that scares the public will result in the end of the trend that caused that event? We had 9/11, and the result was the Patriot Act and invasions across half the Middle East. We’ve had banking crises since 2008, and the result has been austerity for Everybody But Banks across most of the civilized world. If something bad happens to, say, the US power grid, and investigation determines that software problems were responsible, what we’ll probably get will be attempts to force DRM onto every Turing-complete device on Earth.

  • David Harmon  On August 10, 2012 at 8:53 pm

    One way would be to get a law passed making companies liable for their own software disasters… but of course, that would contradict the corporate masters’ rule of “we screw up, you pay”.

Trackbacks

  • By On and On « The Weekly Sift on August 6, 2012 at 11:43 am

    […] The Looming Software Catastrophe. I can’t predict what it will be or when, but two events this week made it obvious there will be one. […]

  • By If They Win « The Weekly Sift on August 13, 2012 at 12:19 pm

    […] up on last week’s “The Looming Software Catastrophe“, here’s an NYT article about the impossibility of testing software like the automatic […]
