Artificial intelligence (AI) in data centers is a double-edged sword: AI workloads are one of the reasons demand for data-center capacity keeps growing, but AI will also be the way to optimize data centers so they can keep up with that demand. Recently, the research and advisory company Gartner predicted that more than 30% of data centers that fail to deploy AI and machine learning will no longer be operationally or economically viable by 2020. The prediction is not exactly startling as AI and machine learning come to dominate more industries, but it is definitely something data centers need to begin preparing for now. Below, we examine some of the ways that AI and machine learning will help data centers improve their services.
This use of AI has recently been in the news: researchers at MIT have created a system that uses reinforcement learning to tailor scheduling decisions to specific workloads in specific server clusters. In reinforcement learning, the system is not given labeled examples; instead, it tries actions on its own and receives a 'reward' signal when its decisions lead toward the desired outcome, gradually reinforcing the behaviors that work best. MIT leveraged this type of learning to teach its system optimal operations and workflows in data centers. Specialized AI chips are also emerging across the industry to help train AI models and improve algorithms. These chips can be installed directly in servers to speed up the adoption of AI tools and techniques.
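To make the reward idea concrete, here is a toy sketch (not MIT's actual system) of tabular Q-learning applied to job placement: the agent picks one of two hypothetical servers for each incoming job and is rewarded for keeping the chosen server's queue short. The state encoding, reward function, and hyperparameters are all invented for illustration.

```python
import random

random.seed(0)

NUM_SERVERS = 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration
q_table = {}  # (state, action) -> estimated long-term reward

def get_q(state, action):
    return q_table.get((state, action), 0.0)

def choose_action(state):
    """Epsilon-greedy: mostly pick the best-known server, sometimes explore."""
    if random.random() < EPSILON:
        return random.randrange(NUM_SERVERS)
    return max(range(NUM_SERVERS), key=lambda a: get_q(state, a))

def step(loads, action, job_size):
    """Place a job on a server; reward is the (negative) resulting queue length."""
    loads = list(loads)
    loads[action] += job_size
    reward = -loads[action]  # shorter queue after placement = better
    return tuple(loads), reward

loads = (0, 0)
for episode in range(2000):
    state = loads
    job = random.randint(1, 3)           # hypothetical job cost
    action = choose_action(state)
    next_state, reward = step(state, action, job)
    # Standard Q-learning update toward reward + discounted best future value
    best_next = max(get_q(next_state, a) for a in range(NUM_SERVERS))
    q_table[(state, action)] = get_q(state, action) + ALPHA * (
        reward + GAMMA * best_next - get_q(state, action))
    # Each server drains some work between arrivals
    loads = tuple(max(0, l - 2) for l in next_state)
```

Over many episodes the table comes to favor sending jobs to the less-loaded server, which is the same reward-driven feedback loop, at vastly larger scale, that a learned cluster scheduler relies on.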
Workflows were traditionally determined by professionals in the data centers, but they can be difficult to monitor because data centers run continuously and personnel are limited in time and resources. In the case of the MIT system, human intervention is only needed to give simple instructions that tell the system what it should accomplish.
Data centers contain thousands of servers that are constantly running data-processing tasks. AI systems, such as the one developed at MIT, use cluster-scheduling algorithms to allocate tasks across all servers in real time, making full use of available computing resources. This allows all servers to be used efficiently, completing tasks around 20 to 30 percent faster. And because machine learning is predictive and self-improving, such systems may soon be able to anticipate demand and optimize scheduling before requests are even made.
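A learned scheduler is ultimately doing a smarter version of a classic heuristic. As a minimal baseline sketch, here is "join the shortest queue" task placement, where each task goes to the currently least-loaded server; the task names and cost estimates are hypothetical.

```python
import heapq

def schedule(tasks, num_servers):
    """Assign each (task_id, estimated_cost) to the least-loaded server."""
    # Min-heap of (current_load, server_id) so the lightest server pops first
    heap = [(0, s) for s in range(num_servers)]
    heapq.heapify(heap)
    assignment = {}
    for task_id, cost in tasks:
        load, server = heapq.heappop(heap)
        assignment[task_id] = server
        heapq.heappush(heap, (load + cost, server))
    return assignment

# Hypothetical workload: (task name, estimated processing cost)
tasks = [("ingest", 5), ("train", 9), ("report", 2), ("backup", 4)]
print(schedule(tasks, 2))
```

Where this greedy rule only reacts to current load, a learned policy like MIT's can account for workload-specific patterns it has observed, which is where the reported 20 to 30 percent speedups come from.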
These days, there is a never-ending barrage of data breaches in the news, and data centers, with the large amounts of information they store, are no exception. Although dedicated cybersecurity experts work to find and remove threats, finding and analyzing those threats can be extremely labor-intensive.
Fortunately, AI is being leveraged in many businesses, including data centers, to help reduce the threat of data breaches. AI can defend against breaches in a variety of ways. Through machine learning, a system can learn a network's normal functions and behaviors. When a cyber threat arises, the network deviates from those norms, and the AI can flag the deviation and shut down the threat. Machine learning can also be used to detect security loopholes and malware, and to analyze incoming and outgoing data for threats. Used this way, AI complements human security experts, who are limited in time and can be prone to error.
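The learn-the-baseline-then-flag-deviations idea can be sketched very simply. The example below fits a mean and standard deviation to hypothetical "normal" request rates and flags any reading more than three standard deviations out; real detectors model many metrics and richer behavior, so treat this as an illustration only.

```python
import statistics

def fit_baseline(samples):
    """Learn 'normal' behavior as a mean and spread of historical readings."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, mean, stdev, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from normal."""
    return abs(value - mean) > threshold * stdev

# Hypothetical requests-per-second observed during normal operation
normal_traffic = [100, 98, 103, 97, 101, 99, 102, 100]
mean, stdev = fit_baseline(normal_traffic)

print(is_anomalous(101, mean, stdev))  # a typical reading
print(is_anomalous(400, mean, stdev))  # a spike that could signal an attack
```

In practice the "baseline" is a trained model over many signals (logins, data transfers, process activity), but the principle is the same: anything the model did not learn as normal gets escalated.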
Outages in data centers are a serious issue, costing anywhere from $100,000 to $1 million per hour of downtime. Predictive AI can help stop outages before they start: the same baseline of normal network behavior that a system learns in order to detect security threats can be used to predict when an outage is likely to occur.
Staff in data centers can predict downtime manually by analyzing past incidents to find the root causes of outages, but subtle warning signs often escape human detection. Monitoring software in these facilities uses sensors and sound to track patterns automatically, watching server performance, network congestion, and disk utilization. All of this information is then used to identify and resolve issues, and predictive models can anticipate failures before they happen, saving centers the revenue that would otherwise be lost to outages.
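As a minimal sketch of how sensor data turns into a failure prediction, the example below fits a least-squares trend line to recent disk-temperature readings and estimates how long until the trend crosses a critical threshold. The readings and the 60°C threshold are hypothetical; production systems combine many such signals with learned models.

```python
def linear_trend(readings):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(readings)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(readings) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, readings)) \
            / sum((x - x_mean) ** 2 for x in xs)
    return slope, y_mean - slope * x_mean

def hours_until(readings, critical):
    """Estimate remaining samples before the trend crosses `critical`."""
    slope, intercept = linear_trend(readings)
    if slope <= 0:
        return None  # no upward trend, so no failure predicted
    return (critical - intercept) / slope - (len(readings) - 1)

temps = [40, 41, 43, 44, 46, 47]  # hypothetical hourly disk temperatures (C)
print(hours_until(temps, critical=60))
```

Even this crude extrapolation shows the value of the approach: a drift too gradual for an operator to notice still produces a concrete time-to-failure estimate, giving staff a window to migrate workloads and swap hardware before an outage.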
Overall, the shift to machine learning should benefit data centers by lowering costs and improving efficiency. With sustainability and environmental impact quickly becoming important issues for data centers, the first place we will likely see a surge in AI is in cooling physical spaces effectively and consistently. That will enable facilities to open in new, warmer climates while still keeping cooling costs and energy consumption low. As reliance on data grows, new technologies like these will be essential to help data centers keep up with demand.