Clustering Attacks on Web Apps to Find the Real Story Behind the Headlines -- Security Today

By Gilad Yehudai
Jul 11, 2018

Security products that aim to block attacks may do their job perfectly while also reporting every attack they block. However, one of the biggest problems in the cyber security arena today is alert fatigue, where there are too many alerts to process manually. One of the largest data breaches on record, which affected more than 41 million customer payment card accounts, could have been prevented if the right action had been taken on the visible security alerts. A website protected by a web application firewall may be targeted by anywhere from hundreds of thousands to millions of attacks in a single day.

The manpower available for processing and analyzing these alerts is never enough, and the result is a flood of important data that goes unhandled and unanalyzed because of alert fatigue. However, by leveraging artificial intelligence, we can develop machine learning algorithms that cluster and consolidate those alerts automatically, condensing days or weeks of work into minutes.

Our goal of clustering attacks on web applications is two-fold:

Highlight interesting patterns inside the attacks
Distill the massive number of attacks into a few actionable incidents

Clustering can help us create a “story” out of the attacks (naming them based on behavior), making them easier for a human observer to understand and analyze. For example, a cluster called “SQL injection attack from several IPs in China using a Havij scanner” tells a much clearer story than the thousands of attacks it contains would if we had to analyze them one by one to find their common pattern.
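
As a rough illustration of how such a name could be derived, the sketch below builds a label from the feature values that most attacks in a cluster share. The field names (`type`, `ip`, `country`, `tool`), the `describe_cluster` helper and the 80 percent cutoff are assumptions made for this example, not details of the production system.

```python
from collections import Counter


def describe_cluster(attacks, min_share=0.8):
    """Build a short label from feature values shared by most attacks."""

    def dominant(key):
        # Return the most common value if it covers at least min_share
        # of the attacks in the cluster, otherwise None.
        counts = Counter(a[key] for a in attacks if key in a)
        if not counts:
            return None
        value, n = counts.most_common(1)[0]
        return value if n / len(attacks) >= min_share else None

    attack_type = dominant("type") or "Mixed"
    parts = [f"{attack_type} attack"]

    n_ips = len({a["ip"] for a in attacks})
    origin = f"from {n_ips} IP{'s' if n_ips != 1 else ''}"
    country = dominant("country")
    if country:
        origin += f" in {country}"
    parts.append(origin)

    tool = dominant("tool")
    if tool:
        parts.append(f"using {tool}")
    return " ".join(parts)


# Hypothetical cluster of three alerts (documentation-range IPs).
cluster = [
    {"type": "SQL injection", "ip": "203.0.113.7", "country": "China", "tool": "Havij"},
    {"type": "SQL injection", "ip": "203.0.113.8", "country": "China", "tool": "Havij"},
    {"type": "SQL injection", "ip": "203.0.113.9", "country": "China", "tool": "Havij"},
]
label = describe_cluster(cluster)
```

A feature that no single value dominates is simply left out of the label, so the name only states what is actually common to the cluster.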

A simple “group-by” algorithm that takes the alerts and groups them by a specific attribute is not good enough. The reason is that there is not a single structure for attacks, and there is no single attribute which can define all the attacks. Thus, a more sophisticated algorithm, which considers a general distance function between attacks on web applications, is needed.
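
The limitation can be seen on a toy example. In the sketch below, the alerts and their field names are invented for illustration: grouping by source IP splits one distributed scan into separate groups, while grouping by attack type merges two unrelated campaigns.

```python
from collections import defaultdict

# Hypothetical alerts: three belong to one distributed Havij scan,
# the fourth is an unrelated sqlmap probe against another page.
alerts = [
    {"ip": "203.0.113.7", "type": "sqli", "tool": "havij", "path": "/item.php"},
    {"ip": "203.0.113.8", "type": "sqli", "tool": "havij", "path": "/item.php"},
    {"ip": "203.0.113.9", "type": "sqli", "tool": "havij", "path": "/item.php"},
    {"ip": "198.51.100.4", "type": "sqli", "tool": "sqlmap", "path": "/login"},
]


def group_by(alerts, attribute):
    """Naive grouping of alerts by the value of a single attribute."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert[attribute]].append(alert)
    return dict(groups)


by_ip = group_by(alerts, "ip")      # one group per IP: the scan is split up
by_type = group_by(alerts, "type")  # a single group: both campaigns merged
```

Neither grouping recovers the intuitive answer of two incidents, which is why a distance that weighs several features together is needed.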

The algorithm has three main stages:

Feature extraction
Distance calculation
Clustering of the attacks

Feature Extraction

The raw data that enters the algorithm is an HTTP request that contains an attack stopped by the firewall, with some additional fields containing more data about the attack, like the source IP and the type of attack.

By leveraging our web application security domain knowledge, we extract additional meaningful features from the raw data that can help us describe the attack.

For example, the source of an attack is not defined solely by the IP. We also use geolocation services to extract more about the origin of the IP, such as source country, ISP, coordinates and ASN. It is also useful to know whether the IP belongs to some kind of anonymity framework, like Tor or an anonymous proxy.
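
A minimal sketch of this stage might look as follows. The alert layout, the `AttackFeatures` record, and the in-memory `GEO_DB` and `TOR_EXIT_NODES` tables are all hypothetical stand-ins: a real deployment would query a geolocation service and an anonymizer feed instead.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-ins for a geolocation service and an anonymizer feed.
GEO_DB = {"203.0.113.7": {"country": "CN", "asn": 64500, "isp": "ExampleNet"}}
TOR_EXIT_NODES = {"198.51.100.23"}


@dataclass(frozen=True)
class AttackFeatures:
    source_ip: str
    attack_type: str
    country: str
    asn: int
    isp: str
    is_anonymized: bool
    url_path: str
    param_names: frozenset
    user_agent: str


def extract_features(alert: dict) -> AttackFeatures:
    """Turn one raw alert into a flat record of descriptive features."""
    ip = alert["source_ip"]
    geo = GEO_DB.get(ip, {})  # empty dict when the IP is unknown
    url = urlsplit(alert["url"])
    return AttackFeatures(
        source_ip=ip,
        attack_type=alert["attack_type"],
        country=geo.get("country", "unknown"),
        asn=geo.get("asn", 0),
        isp=geo.get("isp", "unknown"),
        is_anonymized=ip in TOR_EXIT_NODES,
        url_path=url.path,
        # Only parameter names are kept; values carry the payload itself.
        param_names=frozenset(parse_qs(url.query, keep_blank_values=True)),
        user_agent=alert.get("headers", {}).get("User-Agent", ""),
    )


# Hypothetical alert as a firewall might report it.
alert = {
    "source_ip": "203.0.113.7",
    "attack_type": "sql_injection",
    "url": "http://shop.example/item.php?id=1%27+OR+1%3D1&sort=asc",
    "headers": {"User-Agent": "Havij"},
}
features = extract_features(alert)
```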

Distance Calculation

The next task is to determine a way to calculate the distance between two attacks. This is a core stage of the algorithm as it determines when two attacks are similar, which in general is what the algorithm is trying to achieve. Calculating a distance between two points in the plane is easy – there is a precise formula to do it – but how can we calculate the distance between two URLs or two IPs?

We need to find a method to calculate the distance for every meaningful feature we have in our data, and then combine all these distances to find a single measure between two attacks.
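
One possible way to do this, shown purely as a sketch, is to define a distance in the range 0 to 1 for each feature and combine them with a weighted average. The choices below are assumptions for illustration only: IPs are compared by the number of leading bits they share, URLs by string similarity, the attack type by exact match, and the weights are arbitrary.

```python
import ipaddress
from difflib import SequenceMatcher


def ip_distance(a: str, b: str) -> float:
    """0.0 for identical IPv4 addresses, 1.0 when no leading bits match."""
    x = int(ipaddress.IPv4Address(a))
    y = int(ipaddress.IPv4Address(b))
    shared_bits = 32 - (x ^ y).bit_length()  # length of the common prefix
    return 1.0 - shared_bits / 32


def url_distance(a: str, b: str) -> float:
    """1 minus the similarity ratio of the two URL strings."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def categorical_distance(a, b) -> float:
    """Exact-match distance for features such as the attack type."""
    return 0.0 if a == b else 1.0


# Arbitrary illustrative weights; in practice they would need tuning.
WEIGHTS = {"ip": 0.3, "url": 0.3, "attack_type": 0.4}


def attack_distance(p: dict, q: dict) -> float:
    """Weighted average of the per-feature distances, in [0, 1]."""
    parts = {
        "ip": ip_distance(p["ip"], q["ip"]),
        "url": url_distance(p["url"], q["url"]),
        "attack_type": categorical_distance(p["attack_type"], q["attack_type"]),
    }
    return sum(WEIGHTS[k] * parts[k] for k in WEIGHTS) / sum(WEIGHTS.values())


# Hypothetical attacks: "near" resembles "base", "far" does not.
base = {"ip": "203.0.113.7", "url": "/item.php?id=1", "attack_type": "sqli"}
near = {"ip": "203.0.113.9", "url": "/item.php?id=2", "attack_type": "sqli"}
far = {"ip": "10.1.2.3", "url": "/wp-login.php", "attack_type": "xss"}
```

With these definitions, `attack_distance(base, near)` comes out smaller than `attack_distance(base, far)`, which is the behavior the clustering stage relies on.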

Clustering of the Attacks

The final step is to take the data with all the extracted features and the distance measure between attacks to construct clusters of the attacks. In our case, we used a streaming clustering algorithm. This algorithm creates the clusters over time by receiving a stream of data as more and more attacks enter the system.

Clustering in streaming mode matters because the attacks are delivered in real time. Stream clustering also helps the performance of the algorithm in both time and memory, because not all the attack data is held in memory at once, only the current clusters with their unique features.
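
To make the idea concrete, here is a deliberately simple streaming scheme, sometimes called leader clustering. It is a stand-in chosen for illustration and is not necessarily the algorithm we used. Each incoming attack joins the nearest existing cluster if it is close enough, or starts a new one. The `distance` function, the `threshold` value and the field names are assumptions of the sketch.

```python
from collections import Counter
from dataclasses import dataclass, field


def distance(a: dict, b: dict) -> float:
    """Placeholder distance: share of features whose values differ."""
    keys = a.keys() | b.keys()
    if not keys:
        return 0.0
    return sum(a.get(k) != b.get(k) for k in keys) / len(keys)


@dataclass
class Cluster:
    representative: dict  # first attack seen in this cluster
    size: int = 0
    value_counts: dict = field(default_factory=dict)  # feature -> Counter

    def add(self, attack: dict) -> None:
        # Only summary counts are kept, not the attack itself.
        self.size += 1
        for key, value in attack.items():
            self.value_counts.setdefault(key, Counter())[value] += 1


class StreamingClusterer:
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.clusters = []

    def observe(self, attack: dict) -> int:
        """Assign one attack to a cluster and return that cluster's index."""
        best_index, best_distance = -1, float("inf")
        for index, cluster in enumerate(self.clusters):
            d = distance(attack, cluster.representative)
            if d < best_distance:
                best_index, best_distance = index, d
        if best_distance > self.threshold:
            # Nothing is close enough: open a new cluster.
            self.clusters.append(Cluster(representative=dict(attack)))
            best_index = len(self.clusters) - 1
        self.clusters[best_index].add(attack)
        return best_index


# Hypothetical stream of three attacks arriving one at a time.
stream = [
    {"type": "sqli", "country": "CN", "tool": "havij"},
    {"type": "sqli", "country": "CN", "tool": "sqlmap"},
    {"type": "xss", "country": "US", "tool": "curl"},
]
clusterer = StreamingClusterer(threshold=0.5)
labels = [clusterer.observe(attack) for attack in stream]
```

Memory use grows with the number of clusters rather than the number of attacks, since each cluster stores one representative and per-feature counts.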

To conclude, clustering attacks on web applications helps reveal the hidden patterns behind the attacks and makes huge amounts of data comprehensible to the human security expert. Constructing such a clustering algorithm requires more than just machine learning knowledge. It also requires a high level of domain knowledge in cyber security to understand and construct the various parts of the algorithm.

To read more about clustering attacks on web applications, see Imperva’s blog series.
