Last week, I wrote about resilience from the perspective of cyber security. But resilience is more than just a hedge against cyber attacks; it’s a necessary component of 21st-century IT strategy.

For example, flights all over the world were delayed last week when the software used by many major air carriers to manage passenger reservations, tag luggage and issue boarding passes went down.

This apparently wasn’t a case of hacking. The most likely explanation is that a bug in the software caused it to crash. The outage was unexpected and inconvenient, but it was nothing out of the ordinary for those of us in the IT industry.

The software package, called Altea, is used widely in Asia, Europe and the Americas. The software’s developer is a company called Amadeus. According to the New York Times, Amadeus software is used by 189 airlines.

Amadeus blamed the outage on a network issue, which sounds reasonable considering the complexity and scale of global air travel systems. I find it fascinating, however, that Altea was up and running again within hours. From my point of view, that demonstrates the power of resilience.

My guess is that while passengers waited and airline executives calculated the costs of delayed flights, the IT team at Amadeus switched over to backup systems and focused on fixing the problem.

Obviously, whatever they did worked, because the outage was quickly resolved and the airlines resumed their regular schedules.

When IT professionals plan for the unexpected, they’re rarely caught off guard. It seems to me that the team at Amadeus knew what it was doing and worked methodically to restore the affected systems.

I believe we can all learn a lesson from last week’s outage. It could have been worse, but it wasn’t – thanks to a quick response and a resilient IT strategy.

As IT leaders, we should be practicing with our teams for those unexpected moments when the systems crash and the IT help desk switchboard lights up with calls from angry customers and concerned executives.

Are you ready for the unexpected? Are you practicing? Are you holding drills and testing the readiness of your response teams?

I certainly hope the answers to those questions are all affirmative. Even without hackers, our systems are vulnerable to accidents. Eventually, our luck runs out and something goes wrong. That’s the nature of this business.

The big question remains: Are you prepared for the next outage?