Gottagopestcontrol

Trusted News & Timely Insights

Two strategies to protect your business from the next major technology outage

The CrowdStrike event in July clearly demonstrated the risks of giving a software vendor deep access to network infrastructure. It also raised concerns about the concentration of digital services in the hands of a few companies. One prescient Reddit post noted that CrowdStrike was a threat vector for many of the world’s largest companies, as well as a goldmine of data.

large-scale software failures

In light of the global computer shutdowns following CrowdStrike’s failed update on July 19, prudent executives are asking themselves, “How can I prevent this from happening again?”

Given the market concentration among large technology companies, it is quite possible that such a large-scale outage could happen again. According to Synergy Research Group, the top three cloud providers – Amazon, Microsoft and Google – account for 67% of the global market. Amazon alone had a 31% market share at the end of 2023.

Two strategies could mitigate the impact of similar software failures: diversifying your network infrastructure and rehearsing your response before a failure occurs. Before we talk about defenses, let’s discuss the risks that come with bringing CrowdStrike or other third-party software providers into your organization.

CrowdStrike crash – the tip of the iceberg

Granting device access to a third-party software or service provider entails the following risks:

  • Loss of access to network functions (as in the CrowdStrike incident)
  • Unauthorized access to data (are your IP and customer data secure?)
  • Exposure of your business activities through aggregated data

In addition, your data security now depends on the security practices of a cybersecurity company or cloud service provider.

Consider “mobile device management” or “device monitoring” tools. Most of these are essentially rootkits that give third parties 100% control over your company’s devices. This seems ill-advised for any company that owns intellectual property and wants to keep it secret.

Yes, CrowdStrike screwed up and spectacularly took down several million Windows computers. But crashing Windows computers is just the tip of the iceberg. The bigger threat we’ve collectively and conveniently overlooked is another entity having power over your business operations.

Advanced security software is essential, but under the guise of providing security dashboards, you are giving someone else the keys to your network.

People worry about Facebook tracking them and disable third-party cookies in their personal lives, but software like CrowdStrike’s can watch, monitor and track any company computer, from the humblest intern’s to the CEO’s. Cookies are the least of their worries.

Even if CrowdStrike is reliable and the software works as intended, what happens if someone hacks CrowdStrike? The attacker would theoretically have access to the networks of airlines, banks, and the who’s who of global companies. That worries me. Giving a vendor such extensive network access inevitably carries risk.

How can you, as a CIO or CISO, reduce the risk of another large-scale failure of these Big Tech companies?

Prepare for failure: plan for it, practice it, expect it

The key to surviving the next large-scale system failure is to plan for disasters and practice your response. Make dealing with failures part of normal business practice. When failures are unexpected and infrequent, the processes for dealing with them are untested and may even lead to actions that make the failure worse.

Build a network and a team that can adapt and respond to outages. Remember when insurance companies ran their own data centers and disaster recovery tests were conducted twice a year? Few companies go that far with disaster planning today, but some, like Netflix, are leading by example with chaos engineering. Netflix’s open-source Chaos Monkey software intentionally introduces disruptions into a system and simulates real-world outages to test a system’s resilience.
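To make the idea of a failure drill concrete, here is a minimal sketch in Python. It is not Chaos Monkey and does not use its API. The Service class, the replica names and the run_drill helper are all hypothetical, invented for illustration. The drill takes down one randomly chosen replica per service and reports which services stay available.

```python
import random


class Service:
    """Hypothetical service made of named replicas, each either up or down."""

    def __init__(self, name, replicas):
        self.name = name
        self.up = {replica: True for replica in replicas}

    def is_available(self):
        # The service counts as available while at least one replica is up.
        return any(self.up.values())


def inject_failure(service, rng):
    """Take down one randomly chosen healthy replica; return its name or None."""
    healthy = sorted(r for r, ok in service.up.items() if ok)
    if not healthy:
        return None
    victim = rng.choice(healthy)
    service.up[victim] = False
    return victim


def run_drill(services, seed=0):
    """Disable one replica per service and record whether the service survived."""
    rng = random.Random(seed)  # seeded so a drill can be replayed
    report = {}
    for service in services:
        victim = inject_failure(service, rng)
        report[service.name] = {
            "killed": victim,
            "survived": service.is_available(),
        }
    return report


if __name__ == "__main__":
    # Illustrative fleet: one service with no redundancy, one with three replicas.
    fleet = [
        Service("solo", ["a"]),
        Service("redundant", ["a", "b", "c"]),
    ]
    for name, result in run_drill(fleet).items():
        print(name, result)
```

In this toy model the service with a single replica never survives the drill, while the service with three replicas does. A real drill would target actual infrastructure under controlled conditions, but the principle is the same: find single points of failure on your own schedule rather than a vendor’s.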

Be more like Netflix and less like Delta Air Lines: Delta’s critical crew tracking system was offline for nearly a week following the CrowdStrike update.

Diversify your suppliers and systems

The second strategy to minimize large-scale outages is to avoid the software monoculture that comes from the concentration of digital technology vendors. This is more complex, but worth it.

Some companies buy their core network equipment from three or four different vendors. This makes day-to-day management a little more difficult, but it gives them the peace of mind that if one vendor fails, the entire network won’t go down with it. Whether it’s technology or biology, a monoculture is extremely vulnerable to epidemics that can destroy the entire system.

If in the CrowdStrike scenario the corporate networks had been a mix of Windows, Linux and other operating systems, the damage would not have been as great.
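The arithmetic behind that claim can be sketched in a few lines of Python. The fleet sizes and platform mix below are made-up numbers chosen for illustration, not figures from the CrowdStrike incident, and blast_radius and worst_case are hypothetical helper names.

```python
def blast_radius(fleet, failing_platform):
    """Return the fraction of machines lost if one platform fails entirely."""
    total = sum(fleet.values())
    if total == 0:
        raise ValueError("fleet is empty")
    return fleet.get(failing_platform, 0) / total


def worst_case(fleet):
    """Return (platform, fraction) for the single failure that hurts most."""
    platform = max(fleet, key=fleet.get)
    return platform, blast_radius(fleet, platform)


# Illustrative fleets: every machine on one platform versus a mixed estate.
monoculture = {"windows": 1000}
mixed = {"windows": 500, "linux": 300, "macos": 200}

if __name__ == "__main__":
    print(worst_case(monoculture))  # ('windows', 1.0)
    print(worst_case(mixed))        # ('windows', 0.5)
```

With these assumed numbers, a bug that takes down every Windows machine removes the whole monoculture fleet but only half of the mixed one. The same calculation applies to any single vendor or component, not just operating systems.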

The Rogers Communications outage in Canada in July 2022 illustrates the “diversify your systems” school of thought. The Canadian telecommunications provider suffered a major outage of its cable internet and cellular networks, affecting more than 12 million users for up to 26 hours.

Recovery efforts were complicated because Rogers employees themselves relied on the downed Rogers cellular and internet systems. Employees who were out of the office could not get online or even use their cell phones. A third-party review found that Rogers employees were unable to access the critical error logs detailing the root cause of the outage until 14 hours later.
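One way to catch this kind of circular dependency ahead of time is to list what each recovery tool relies on and compare it with what production relies on. The sketch below is a hypothetical illustration in Python. The provider and tool names are invented and do not describe Rogers’ actual setup.

```python
def shared_dependencies(production, recovery_tools):
    """Map each recovery tool to the providers it shares with production."""
    prod = set(production)
    overlaps = {}
    for tool, providers in recovery_tools.items():
        common = prod & set(providers)
        if common:
            overlaps[tool] = sorted(common)
    return overlaps


# Hypothetical inventory: what production runs on, and what responders need.
production = ["carrier-a"]
recovery_tools = {
    "staff phones": ["carrier-a"],
    "log access": ["carrier-a", "vpn"],
    "backup link": ["carrier-b"],
}

if __name__ == "__main__":
    for tool, providers in shared_dependencies(production, recovery_tools).items():
        print(f"{tool} fails together with production via: {', '.join(providers)}")
```

Any tool that shows up in the result would go down together with production. In this made-up inventory only the backup link survives an outage of carrier-a, which is the argument for giving responders at least one communication and access path from a different supplier.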

Conclusion

Third-party software providers and cloud services are an integral part of the IT landscape, but if we want to minimize the risk to our business, we must resist the temptation to put all our eggs in one basket.

The lessons from CrowdStrike are: diversify your suppliers and systems and dust off your contingency plans.
