“What just happened in my network?”
Many of us turn to our IT security team to answer this question. Typically, it is answered by analyzing data at scheduled intervals, after the data has already entered your system.
This after-the-fact analysis is clearly not adequate to secure data against today’s cybercriminals. Even a company as large and security-conscious as Target took two weeks to discover a breach of its credit card data that affected more than 70 million individuals.
Modern data-center teams must move beyond “What just happened?” to “What’s going to happen next?”
How to Predict the Future
To predict the future of your network’s security, you need to process, enhance, cross-correlate, baseline and analyze information as it comes into the system, in other words, in “real time.” You then have the context to understand the who, what, when, where, why and how of the data.
Step 1: Parse the raw messages according to common attributes, also known as “metadata.”
You need to be able to parse raw messages into a common list of attributes that are used by all of the different devices and applications making up your network. An attribute could be the source IP, destination IP, user or destination port. Once data is grouped according to this metadata, it is easy to search. For example, you could run a report that asks the system which sources are connecting to a given destination IP address. This pre-processing step allows the system to automatically cross-correlate all of the possible interactions.
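As a rough illustration of this parsing step, the sketch below turns raw log lines into dictionaries of common attributes and then answers the "which sources go to this destination?" question. The log format, the field names and the function names are assumptions made up for this example; real devices each need their own parser.

```python
import re

# Hypothetical firewall log format, invented for this sketch.
LOG_PATTERN = re.compile(
    r"src=(?P<src_ip>\S+) dst=(?P<dst_ip>\S+) "
    r"dport=(?P<dst_port>\d+) user=(?P<user>\S+) action=(?P<action>\S+)"
)

def parse_message(raw):
    """Return a dict of common attributes, or None if the line does not match."""
    match = LOG_PATTERN.search(raw)
    if match is None:
        return None
    event = match.groupdict()
    event["dst_port"] = int(event["dst_port"])  # store the port as a number
    return event

def sources_for_destination(events, dst_ip):
    """List the distinct source IPs that talked to the given destination IP."""
    return sorted({e["src_ip"] for e in events if e["dst_ip"] == dst_ip})

raw_messages = [
    "src=10.0.0.5 dst=192.0.2.10 dport=443 user=alice action=allow",
    "src=10.0.0.7 dst=192.0.2.10 dport=22 user=bob action=deny",
    "src=10.0.0.5 dst=198.51.100.4 dport=80 user=alice action=allow",
]
events = [e for e in map(parse_message, raw_messages) if e is not None]
print(sources_for_destination(events, "192.0.2.10"))
```

Because every message ends up with the same attribute names, the same query works regardless of which device produced the original line.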
Step 2: Enhance the raw message with additional known facts.
Enhance the original message with all possible sources of information. Your raw message may contain the source IP, destination IP and destination port, along with the fact that it was allowed through the firewall. This data can be enhanced by adding the host names of the source and destination IP addresses and the geolocation of those addresses. Finally, you may know which end user was logged in at each of those IP addresses.
The message now has additional attributes that have enhanced the original message, providing valuable context to your analysis.
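The enrichment step might look something like the following sketch. The three lookup tables are hypothetical stand-ins for DNS, a geolocation database and a record of logged-in users, and the attribute names are assumptions chosen for this example.

```python
# Hypothetical lookup tables standing in for DNS, a geolocation database
# and a record of which user is logged in at which address.
HOSTNAMES = {"10.0.0.5": "laptop-alice", "192.0.2.10": "web01"}
GEO = {"192.0.2.10": "US"}
LOGGED_IN_USERS = {"10.0.0.5": "alice"}

def enhance(event):
    """Return a copy of the event with extra context attributes added."""
    enriched = dict(event)  # leave the original message untouched
    for side in ("src", "dst"):
        ip = event[f"{side}_ip"]
        enriched[f"{side}_host"] = HOSTNAMES.get(ip)  # None when unknown
        enriched[f"{side}_geo"] = GEO.get(ip)
    enriched["src_user"] = LOGGED_IN_USERS.get(event["src_ip"])
    return enriched

event = {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.10",
         "dst_port": 443, "action": "allow"}
enriched = enhance(event)
print(enriched["src_host"], enriched["dst_geo"], enriched["src_user"])
```

Unknown values are stored as None rather than dropped, so later analysis can tell "not looked up" apart from "looked up and not found" if it needs to.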
Step 3: Understand what is in your environment.
By using a machine intelligence database, your system can keep track of information such as host names, serial numbers, IP addresses, labels on interfaces, locations, running and installed software, patches, firmware, users, LDAP group memberships, and layer-2 and layer-3 topology maps. This information is populated from logs and scheduled discoveries of the devices in your environment, and the collection should happen virtually, without having to install agents on your devices. Users and the system itself can leverage this information when investigating issues.
You have now created an environment in which machine learning can occur. For example, this database contains a mapping of Intrusion Prevention System (IPS) signatures to vulnerabilities and patches. If your network IPS issues an alert that your web server is being attacked, the system can automatically determine whether the server is really vulnerable to that attack. This reduces noise by classifying each alert as either a legitimate threat or a false positive.
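A minimal sketch of that signature-to-patch check is shown below. The signature IDs, vulnerability IDs, patch names and host inventory are all invented for illustration; a real system would draw them from its own discovery data and vendor feeds.

```python
# Hypothetical data: every identifier below is made up for this sketch.
SIGNATURE_TO_VULN = {"SIG-1001": "VULN-A", "SIG-2002": "VULN-B"}
VULN_FIXED_BY_PATCH = {"VULN-A": "PATCH-17", "VULN-B": "PATCH-42"}
INSTALLED_PATCHES = {"web01": {"PATCH-17"}, "db01": set()}

def classify_alert(signature_id, target_host):
    """Classify an IPS alert using what is known about the target host."""
    vuln = SIGNATURE_TO_VULN.get(signature_id)
    patch = VULN_FIXED_BY_PATCH.get(vuln)
    if patch is None or target_host not in INSTALLED_PATCHES:
        return "unknown"            # not enough inventory data to decide
    if patch in INSTALLED_PATCHES[target_host]:
        return "false positive"     # the host is already patched
    return "legitimate threat"      # the host is missing the fix

print(classify_alert("SIG-1001", "web01"))
print(classify_alert("SIG-2002", "web01"))
```

Returning "unknown" when the inventory has a gap is a deliberate choice here: treating missing data as a false positive would hide exactly the alerts a human should look at.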
Step 4: Learn what is normal so you can determine what is abnormal.
To know what is normal in your system, you need to understand the entire environment by tracking every monitored device. Ask yourself:
- What resources are being consumed over time?
- Who is communicating with your systems, and how are they doing it?
- How are your systems communicating with other systems?
This is accomplished by establishing a baseline created by your own system. Many security systems create baselines; however, they use default information created by the manufacturer, rather than information specific to your actual system. A truly useful baseline must contain your own variables and use cases.
For example, how many connections are typically permitted or denied by the firewall? How many errors are typically seen on servers? How many logons are there to the domain controller or the database servers? What is the IO rate from storage systems? This baseline will learn and adjust as your network evolves. Once baselines are known, deviations can be detected.
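One simple way to express "learn what is normal, then flag deviations" is a sliding-window baseline, sketched below. The three-standard-deviation threshold, the window size and the class name are assumptions for illustration, not a recommendation; real traffic often needs per-hour or per-weekday baselines.

```python
from collections import deque
from statistics import mean, stdev

class Baseline:
    """Sliding-window baseline for one metric on one device."""

    def __init__(self, window=168, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # old values age out as the network evolves
        self.threshold = threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Record a new measurement and report whether it deviates from normal."""
        abnormal = self._is_abnormal(value)
        self.samples.append(value)
        return abnormal

    def _is_abnormal(self, value):
        if len(self.samples) < self.min_samples:
            return False  # still learning, no baseline yet
        mu = mean(self.samples)
        sigma = stdev(self.samples)
        if sigma == 0:
            return value != mu
        return abs(value - mu) > self.threshold * sigma

# Hypothetical hourly counts of connections denied by the firewall.
denied = Baseline()
flags = [denied.observe(v) for v in [100, 104, 98, 101, 97, 103, 500]]
print(flags)
```

In this sketch every value, including a flagged one, is added to the window so the baseline keeps adapting; a stricter design could hold back flagged values until a person confirms them.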
Step 5: Have your system create watch lists.
The next step is to teach your system what to look out for and to create watch lists of suspicious events. For example, you could teach the system to detect a device that is asking for an IP address (via DHCP) and check whether it matches a list of devices that are part of your domain, or whether it matches the naming schema within your corporation. If it matches neither, the device is added to a watch list. Watch lists are populated by running reports, by lists supplied by users, or by rules triggering (a rule is really a real-time query looking for something).
So, if a hacker connects to your network and the computer was not previously added to the domain or it did not match the naming schema for computers in your corporation, it will automatically be added to this watch list. Now you can have reports and rules examining the devices in these watch lists, and if something abnormal occurs, the system automatically takes corrective action or alerts a human being to investigate.
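The DHCP example could be sketched as a small rule like the one below. The device names, the naming schema and the function name are hypothetical; the point is only that a device failing both checks lands on the watch list.

```python
import re

# Hypothetical inventory and naming schema (site-type-number), invented here.
DOMAIN_DEVICES = {"nyc-lt-0042", "nyc-ws-0007"}
NAMING_SCHEMA = re.compile(r"^[a-z]{3}-(lt|ws|srv)-\d{4}$")

watch_list = set()

def handle_dhcp_request(hostname, mac):
    """Return True if the requesting device was added to the watch list."""
    name = hostname.lower()
    if name in DOMAIN_DEVICES or NAMING_SCHEMA.match(name):
        return False  # known device, or at least named like one of ours
    watch_list.add((name, mac))
    return True

handle_dhcp_request("NYC-LT-0042", "00:00:5e:00:53:01")   # domain member
handle_dhcp_request("lon-srv-0100", "00:00:5e:00:53:02")  # matches the schema
handle_dhcp_request("kali", "00:00:5e:00:53:03")          # neither
print(watch_list)
```

Other reports and rules can then restrict their attention to the entries in `watch_list` instead of every device on the network.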
Step 6: Predictive analytics fix issues before they become problems.
Now that your devices and applications have established baselines, growth and resource trends can be analyzed. You will be able to predict when you will run out of disk space, networking bandwidth or connections to your web application and you can automatically add more resources before there is a shortage.
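As one hedged example of trend analysis, the sketch below fits a straight line to daily disk usage and estimates how many days remain before the disk is full. The linear-growth assumption, the function name and the sample numbers are all illustrative; real growth is often seasonal or bursty.

```python
def days_until_full(daily_used_gb, capacity_gb):
    """Estimate days left before capacity is reached, using a least-squares line.

    Returns None when there is too little data or usage is not growing.
    """
    n = len(daily_used_gb)
    if n < 2:
        return None
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(daily_used_gb) / n
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_used_gb))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    if slope <= 0:
        return None  # flat or shrinking usage never fills the disk
    intercept = y_mean - slope * x_mean
    day_full = (capacity_gb - intercept) / slope  # day index where the line hits capacity
    return max(0.0, day_full - (n - 1))           # measured from the latest sample

# Hypothetical samples: one reading per day, in GB.
usage = [500, 510, 520, 530, 540]
print(days_until_full(usage, capacity_gb=600))
```

The same calculation applies to bandwidth or connection counts; the result can feed a rule that requests more resources once the estimate drops below a chosen lead time.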
Over time, the system can learn and determine patterns from the masses of data it accumulates. It will create its own rules and watch lists independent of user interaction. This is when real machine learning can occur. Machine learning will move your security from “What just happened?” to “What’s happening now?” to “What’s going to happen next?”
Now the system can either take action itself to automatically remediate, or it can alert a human being to take action. The alert can provide all of the facts (who, what, when, where, why and how) surrounding an incident, so a person can investigate and stop a breach before valuable corporate or customer data is lost.
And your IT team can evolve from just defending against security attacks to proactively preventing cyberthreats before they occur.