May 26, 2024
The cyber risks of overheating data centers




The heat is on. Climate change creates new challenges for data centers while exposing a new vulnerability that attackers can quickly weaponize. The burning problem of overheated servers caused by record heat waves has melted down data centers from Los Angeles to London. 

Many data center cooling systems weren’t designed to withstand the heat waves the world is experiencing today. Cooling systems are failing under the strain, allowing servers to overheat, leading to many of the world’s most popular websites and applications crashing. 

Attackers want to weaponize heat    

Companies that accept a slightly hotter data center in exchange for lower energy costs are inviting a breach or, at the least, a data center meltdown. No one cost-reduced their way into a secure data center. Sustainability is the path away from spiraling energy costs. 

Attackers aim to weaponize heat and exfiltrate billions of dollars in data from data centers by attacking cooling systems. With threats ranging from cybercrime groups to sophisticated Advanced Persistent Threat (APT) attack teams, many funded by nation-states, expect more data center attacks where heat is the attacker’s weapon.


Don’t invite cyber risk by overheating data centers 

Data center costs continue to spiral to record levels for many companies, with energy costs outpacing all other expense categories. Making cooling as efficient as possible is critical to data center profitability, since cooling accounts for approximately 40 percent of a data center’s energy consumption. While data centers continue to make strides in improving energy efficiency by phasing in sustainability, starting with improved cooling methods, many are introducing greater cyber risk by limiting how far they go with sustainability. 

“Data centers are big energy consumers—a hyper scaler’s data center can use as much power as 80,000 households do. Pressure to make data centers sustainable is therefore high, and some regulators and governments (including Singapore and the Netherlands) are imposing sustainability standards on newly built data centers,” according to McKinsey.

Despite record levels of capital investment in sustainability, data centers still see overheated servers prone to failure, leading to outages. Newer cooling technologies, including outside air cooling, are cost-effective, yet they can introduce contaminants into a data center’s infrastructure and potentially damage hardware.

Another approach data centers take to reduce cooling costs is raising server inlet temperatures. It’s a calculated bet that the cost savings will be worth the increased risk of server CPU failures. It’s well-known in data centers that servers are the single greatest cause of outages, which makes the savings from allowing temperatures to rise questionable. Server outages cause 30% of all data center interruptions and outages. Heat-induced server failures drive unplanned outages that disrupt data center operations and can cause websites, apps, and online storage to fail unpredictably, costing billions of dollars in lost productivity. 

VentureBeat interviewed several data center recovery specialists, who spoke on condition of anonymity, about how chronic data center overheating has become. They affirmed that data centers are redlining to save on costs, with many struggling to keep server inlet temperatures below 80°F, the consensus standard for server cooling. Cost savings are winning out over reducing cyber risk. “Data centers are in for a wake-up call if climate change continues to deliver triple-digit heat waves and they don’t get serious about long-term, more sustainable and affordable cooling that doesn’t invite more risk,” a leading data center recovery specialist told VentureBeat. 

Twitter’s Sacramento data center going offline due to extreme heat in 2022 was a preview of how extreme heat could affect data center performance in the future. In an internal memo to engineers, Carrie Fernandez, Twitter’s vice president of engineering, wrote, “On September 5th, Twitter experienced the loss of its Sacramento (SMF) data center region due to extreme weather. The unprecedented event resulted in the total shutdown of physical equipment in SMF.” Fernandez said the company’s data center was in a “non-redundant state” after the outage. Cyberattackers noticed this and other extreme heat-based outages and continue to fine-tune their tradecraft to attack HVAC, electricity and redundant power systems.

Specialists cite an incident in 2021 as a cautionary tale of redlining server heat to save on costs. A data center operator in Singapore raised temperatures to borderline unsafe levels to save on cooling costs, which led to widespread server failures. The meltdown lasted nearly a week, leading to thousands of customers experiencing outages.

Data center attacks that weaponized heat   

Attackers are fine-tuning their tradecraft and creating malware that attacks cooling systems to force a data center meltdown to get their ransomware demands met or make a political statement. 

A data center in Atlanta, Georgia, was hit with a cyberattack in 2018 that led to the shutdown of several city services, including the municipal court, the police department and Hartsfield-Jackson Atlanta International Airport. Cyberattackers used a variant of SamSam ransomware designed to encrypt data on every available server. Attackers also penetrated the data center’s cooling system, causing temperatures to rise above 100 degrees, damaging server CPUs and related silicon-based equipment. Cyberattackers demanded $51,000 in Bitcoin to unlock servers and release their control of the cooling system.

An Iranian data center was the victim of a cyberattack in 2019 that disrupted its power supply and cooling systems, causing servers and supporting systems to overheat quickly. An adversarial nation opposing Iran’s nuclear program took responsibility for the attack, using the malware program Stuxnet designed to target and bring down industrial control systems. Iranian data center operators say the malware caused the centrifuges at the data center to spin out of control and break down.

A data center in Singapore was attacked in July 2022, disrupting the online servers of several government agencies, banks and media outlets. Attackers exploited a firewall vulnerability, causing servers to malfunction due to overheating. An Indonesian hacking group took responsibility for the attack, claiming it was in retaliation for Singapore’s ongoing support of Myanmar’s military junta. 

Striking a balance between security and sustainability

Data centers face the challenging paradox of continually increasing storage volume, reducing access latency, controlling costs and finding new ways to harden themselves against cyberattacks. Adding to the challenges is the pressure data centers are under to reduce their environmental impact and energy consumption, as data centers account for about 1% of global electricity use and about 0.3% of global greenhouse gas emissions. Data center operators are creating innovative new strategies to achieve these challenging goals. They include relying more on renewable energy sources, water-efficient cooling systems and waste heat recovery technologies to improve sustainability. 

VentureBeat has learned from data center owners and recovery experts implementing these programs that the following strategies are paying off the most: 

Get in the habit of conducting detailed thermal mapping to identify hot spots and optimize cooling.

Data center recovery specialists say this is a blind spot for many data center operators, who put off getting thermal mapping done periodically. Given how quickly servers can degrade when exposed to extreme temperatures, it’s a good idea for this task to become part of any data center’s muscle memory. 
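As a rough illustration of what a thermal mapping pass produces, the sketch below flags rack positions whose inlet temperature exceeds the 80°F figure cited earlier in this article. The function name, the (row, rack) layout and the survey readings are all hypothetical and invented for illustration; a real survey would draw on sensor data and a far denser grid.

```python
from typing import Dict, List, Tuple

# Inlet ceiling cited by the specialists quoted in this article (degrees F).
INLET_LIMIT_F = 80.0

Position = Tuple[int, int]  # hypothetical (row, rack) coordinate


def find_hot_spots(
    readings: Dict[Position, float], limit_f: float = INLET_LIMIT_F
) -> List[Tuple[Position, float]]:
    """Return (position, temperature) pairs above the limit, hottest first."""
    hot = [(pos, temp) for pos, temp in readings.items() if temp > limit_f]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Made-up survey data: inlet temperatures in degrees F keyed by (row, rack).
survey = {(0, 0): 74.5, (0, 1): 81.2, (1, 0): 79.9, (1, 1): 86.0}
hot_spots = find_hot_spots(survey)

for position, temp in hot_spots:
    print(f"row {position[0]}, rack {position[1]}: {temp:.1f}F")
```

Sorting hottest first is a design choice that puts the positions most in need of airflow changes at the top of the report.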

Consider how AI can help improve power consumption, strengthened with eco-friendly chillers and evaporative cooling.

The benefits AI can bring to the data center are just beginning, according to the experts and data center operators VentureBeat spoke with. One considered AI optimization critical to their success in meeting the sustainability benchmarks required by internal and regulatory standards. Cautious of exceeding server inlet temperature limits, more data centers are also using AI to interpret sensor data and trigger alerts and actions in real time, adjusting dynamically to prevent overheating while maximizing efficiency.
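The alerting side of this can be sketched without any AI at all. The example below is a simple rule-based stand-in, not the systems the operators described: it averages the most recent inlet readings and escalates as the average approaches the 80°F limit cited in this article. The class name, the warning margin, the window size and the sample readings are assumptions made for illustration.

```python
from collections import deque


class InletMonitor:
    """Rolling-average inlet temperature check (hypothetical, rule-based)."""

    def __init__(self, limit_f: float = 80.0, margin_f: float = 3.0, window: int = 3):
        self.limit_f = limit_f      # ceiling cited in the article
        self.margin_f = margin_f    # assumed early-warning band below the ceiling
        self.samples = deque(maxlen=window)  # keeps only the latest readings

    def update(self, temp_f: float) -> str:
        """Record a reading and return 'ok', 'warn' or 'alert'."""
        self.samples.append(temp_f)
        average = sum(self.samples) / len(self.samples)
        if average > self.limit_f:
            return "alert"
        if average > self.limit_f - self.margin_f:
            return "warn"
        return "ok"


# Made-up stream of inlet readings in degrees F.
monitor = InletMonitor()
states = [monitor.update(t) for t in [74.0, 76.0, 82.0, 84.0, 85.0]]
print(states)
```

Averaging over a short window keeps a single noisy reading from triggering an alert, at the cost of reacting a few samples later to a genuine spike.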

Redundant cooling systems with fault-tolerant power sources are the future of data center cooling.

It’s undeniable that the upsurge in heat waves and the data center failures across Europe, the United States, and the major one in London last summer are leading indicators of an entirely new type of temperature challenge data centers must take on. 

Using AI to optimize data center asset inventories is gaining traction.

It’s a perfect use case for AI and machine learning (ML) algorithms, which can be trained to optimize hardware and system configurations for the increasingly complex set of constraints that data centers need to operate within. AI-based optimization techniques can factor in sustainability requirements, resource loads and cooling requirements by server CPU, all focused on creating the optimal environmental conditions for a data center to run at peak performance. 

Data centers are in a race to improve cybersecurity and sustainability.

As the data center industry strives to reduce its environmental footprint, it must balance sustainability and cyber-resilience goals. Sustainable solutions that deliver energy savings, such as outside air cooling, can amplify security risks if not managed as part of a broader data center cybersecurity plan.

In the race to improve data center sustainability, operators and the companies behind them can’t lose sight of securing cooling and infrastructure, or sacrifice that security for cost savings. It’s time to embrace sustainability over risk.
