Exploiting machine learning in cybersecurity

Thanks to technologies that generate, store and analyze huge sets of data, companies are able to perform tasks that previously were impossible. But the added benefit comes with its own drawbacks, particularly from a security standpoint.

With reams of data being generated and transferred over networks, cybersecurity experts will have a hard time monitoring everything that gets exchanged — potential threats can easily go unnoticed. Hiring more security experts would offer a temporary reprieve, but the cybersecurity industry is already dealing with a widening talent gap, and organizations and firms are hard-pressed to fill vacant security posts.

The solution might lie in machine learning, the technology that is transforming a growing number of industries and has become a buzzword in Silicon Valley. But while more and more jobs are being ceded to robots and artificial intelligence, is it conceivable to entrust machines with a responsibility as complicated as cybersecurity? The topic is being hotly debated by security professionals, with strong arguments on both sides. In the meantime, tech firms and security vendors are looking for ways to add this hot technology to their cybersecurity arsenal.

Pipe dream or reality?

Simon Crosby, CTO at Bromium, calls machine learning the pipe dream of cybersecurity, arguing that “there’s no silver bullet in security.” Backing up this argument is the fact that in cybersecurity, you’re always up against some of the most devious minds, people who already know very well how machines and machine learning work and how to circumvent their capabilities. Many attacks are carried out through minuscule and inconspicuous steps, often concealed in the guise of legitimate requests and commands.

Others, like Mike Paquette, VP of Products at Prelert, argue that machine learning is cybersecurity’s answer to detecting advanced breaches. In their view, it will shine in securing IT environments as they “grow increasingly complex,” as “more data is being produced than the human brain has the capacity to monitor” and as it becomes nearly impossible “to gauge whether activity is normal or malicious.”

Stephan Jou, CTO at Interset, is a proponent of machine-learning-powered cybersecurity. He acknowledges that AI is not yet ready to replace humans, but says it can boost human efforts by automating the process of recognizing patterns.

What’s undeniably true is that machine learning has very distinct use cases in the realm of cybersecurity, and even if it’s not a perfect solution, it is helping improve the fight against cybercrime.

Attended machine learning

The main argument against security solutions powered by unsupervised machine learning is that they churn out too many false positives and alerts, resulting in alert fatigue and decreased sensitivity to real threats. On the other hand, the volume of data and events generated in corporate networks is beyond the capacity of human experts. The fact that neither can shoulder the burden of fighting cyberthreats alone has led to the development of solutions where AI and human experts join forces instead of competing with each other.

MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) has led one of the most notable efforts in this regard, developing a system called AI2, an adaptive cybersecurity platform that uses machine learning and the assistance of expert analysts to adapt and improve over time.

The system, which takes its name from the combination of artificial intelligence and analyst intuition, reviews data from tens of millions of log lines each day and singles out anything it finds suspicious. The filtered data is then passed on to a human analyst, who provides feedback to AI2 by tagging legitimate threats. Over time, the system fine-tunes its monitoring and learns from its mistakes and successes, eventually becoming better at finding real breaches and reducing false positives.
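The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AI2's actual implementation: the two features (requests per minute and failed logins), the weighted z-score ranking, the `apply_feedback` update rule and the simulated analyst are all hypothetical stand-ins chosen to keep the example small.

```python
import statistics

def z_scores(events):
    """Per-feature absolute z-scores for each event (rows are events)."""
    columns = list(zip(*events))
    means = [statistics.fmean(col) for col in columns]
    # Guard against zero spread so a constant feature cannot divide by zero.
    spreads = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [abs(x - m) / s for x, m, s in zip(event, means, spreads)]
        for event in events
    ]

def rank_suspicious(events, weights, budget):
    """Return the indices of the `budget` events with the highest weighted score."""
    scored = [sum(w * z for w, z in zip(weights, zs)) for zs in z_scores(events)]
    order = sorted(range(len(events)), key=lambda i: scored[i], reverse=True)
    return order[:budget]

def apply_feedback(events, weights, labels, rate=0.5):
    """Nudge feature weights up for confirmed attacks, down for false alarms."""
    zs = z_scores(events)
    updated = list(weights)
    for index, is_attack in labels.items():
        direction = 1.0 if is_attack else -1.0
        for f, z in enumerate(zs[index]):
            updated[f] = max(0.0, updated[f] + direction * rate * z)
    return updated

def analyst_says_attack(event):
    """Hypothetical stand-in for the human analyst: many failed logins = attack."""
    _requests_per_minute, failed_logins = event
    return failed_logins > 20

# Hypothetical log summary: (requests per minute, failed logins) per event.
events = [(10, 0), (12, 1), (11, 0), (9, 1), (13, 0),
          (200, 2),    # harmless traffic spike
          (11, 40),    # brute-force attempt
          (12, 45)]    # brute-force attempt

weights = [1.0, 1.0]
shown = rank_suspicious(events, weights, budget=3)            # what the analyst sees
labels = {i: analyst_says_attack(events[i]) for i in shown}   # analyst tags threats
new_weights = apply_feedback(events, weights, labels)
reranked = rank_suspicious(events, new_weights, budget=2)

print("shown to analyst:", shown)
print("after feedback:", reranked)
```

In this toy data the traffic spike ranks first before any feedback is given. Once the analyst marks it as a false alarm and confirms the two brute-force events, the failed-login feature carries more weight and the spike drops out of a two-event budget, which is the kind of false-positive reduction the paragraph describes.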

Research lead Kalyan Veeramachaneni says, “Essentially, the biggest savings here is that we’re able to show the analyst only up to 200 or even 100 events per day,” which is far fewer than the tens of thousands of security events that cybersecurity experts have to deal with every day.

The platform was tested during a 90-day period, crunching a daily dose of 40 million log lines generated from an e-commerce website. After the training, AI2 was able to detect 85 percent of the attacks without human assistance.

Finnish security vendor F-Secure is another firm that has placed its bets on combining human and machine intelligence in its most recent cybersecurity efforts, with the aim of reducing the time it takes to detect and respond to cyberattacks. On average, it takes organizations several months to discover a breach. F-Secure wants to cut that time frame down to 30 minutes with its Rapid Detection Service.

The system gathers data from a combination of software installed on customer workstations and sensors placed in network segments. The data is fed to threat intelligence and behavioral analytics engines, which use machine learning to classify incoming samples, determine normal behavior and identify outliers and anomalies. The system uses near-real-time analytics to identify known security threats, stored data analytics to compare samples against historical data and big data analytics to identify evolving threats through anonymized datasets gathered from a vast number of clients.
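To make the idea of determining normal behavior and comparing samples against historical data more concrete, here is a minimal Python sketch. It is not F-Secure's method: it assumes a single hypothetical metric (outbound megabytes per hour for one workstation) and swaps in a simple robust statistic, the modified z-score based on the median and the median absolute deviation (MAD), where a production system would use far richer models.

```python
import statistics

def build_baseline(history):
    """Summarize past observations by their median and median absolute deviation."""
    center = statistics.median(history)
    mad = statistics.median([abs(x - center) for x in history])
    return center, mad

def is_anomalous(value, baseline, threshold=3.5):
    """Flag a new observation whose modified z-score exceeds the threshold."""
    center, mad = baseline
    if mad == 0:
        # History with no spread: anything different from it is unusual.
        return value != center
    # 0.6745 rescales the MAD so the score is comparable to an ordinary
    # z-score when the data is roughly normally distributed.
    modified_z = 0.6745 * abs(value - center) / mad
    return modified_z > threshold

# Hypothetical history: outbound megabytes per hour for one workstation.
history = [52, 48, 50, 55, 47, 51, 49, 53]
baseline = build_baseline(history)

for observed in (54, 400):
    print(observed, "anomalous" if is_anomalous(observed, baseline) else "normal")
```

The median and MAD are used here instead of the mean and standard deviation because a few extreme values in the history would otherwise inflate the baseline and hide later outliers. The threshold of 3.5 is a common rule of thumb, not a value taken from the article.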

At the heart of the system is a team of cybersecurity experts who will go through the results of the machine learning analysis and ultimately identify and handle security incidents. With the bulk of the work being carried out by machine learning, the experts and software engineers can become much more productive and focus on more advanced concepts, such as identifying relationships between threats, reverse engineering attacks and enhancing the overall system.

“The human component is an important factor,” says Erka Koivunen, cybersecurity advisor at F-Secure. “Attackers are human, so to detect them you can’t rely on machines alone. Our experts know how attackers think, the very tactics they use to hide their presence from standard means of detection.”

Sifting through unstructured data

While data gathered from endpoints and network traffic helps in identifying threats, it accounts for only a small part of the cybersecurity picture. A lot of the intelligence and information required to detect and protect enterprises from emerging threats lies in unstructured data such as blog posts, research papers, news stories and social media posts. Being able to make sense of these resources is what gives cybersecurity experts the edge over machines.

Tech giant IBM wants to bridge this gap by using the natural language processing capabilities of its flagship artificial intelligence platform, Watson. The company intends to take advantage of Watson’s capabilities in sifting through unstructured data to read and learn from thousands of cybersecurity documents per month, and to apply that knowledge to analyze, identify and prevent cybersecurity threats.

“The fascinating difference between teaching Watson and teaching one of my children,” Caleb Barlow, vice president at IBM Security, told Wired, “is that Watson never forgets.”

By combining this capability with the data already being gathered by its threat intelligence platform, X-Force Exchange, IBM wants to address the shortage of talent in the industry. The aim is to raise Watson’s level of efficiency to that of an expert assistant and to help reduce the rate of false positives.

However, Barlow doesn’t believe that Watson is here to replace humans. “It’s not about replacing humans, but about making them superhumans,” he said in an interview with Fortune.

If the experiment is successful, Watson should be deployed to enterprise customers later this year as a cloud service named Watson for Cyber Security. Until then, it has a lot to learn about how cybersecurity works, which is no easy feat.

Cybersecurity startup Massive Alliance uses a slightly different approach to glean information from unstructured data. Its cybersecurity platform Strixus uses a set of sophisticated proprietary tools that anonymously gather data related to its customers from the surface web (public search engines), deep web (non-indexed pages) and dark web (Tor-based networks).

The collected data is analyzed by a sentiment-based machine learning engine that discerns the general emotion of content. The mechanics behind the technology include mathematical engines that produce adaptive models of the behavior of threat actors and determine the danger they pose to the client. The results are finally submitted to analysts who process the information and spot potential risks.

This technique gives the cybersecurity firm the ability to monitor billions of results on a daily basis, identify and flag the publication of potentially brand-damaging information and proactively detect and prevent attacks and data loss before they happen.

“To date, human intelligence is still the most pointed form of intelligence and can be the most effective in a specific operation or crisis,” says Brook Zimmatore, the company’s CEO. “However, focus on Machine Learning technology across any industry is vital as human efforts have their limitations.”

Will artificial intelligence replace cybersecurity experts?

It’s still too early to determine whether any of these efforts will result in cybersecurity experts being totally replaced by machine-learning-based solutions. Maybe the balance will shift in the future, but, for the moment, humans and robots have no other choice than to unite against the ever-increasing threats that lurk in cyberspace.