The Evolution of Scene Intelligence
How video analytics are becoming smarter and more accurate
- By Robert Muehlbauer
- Sep 18, 2023
Remember the early days of video analytics? All the alerts triggered by a passing shadow? Or leaves quivering in the breeze? Even a car with bright headlights driving by? Because these analytics were based solely on pixel changes, they tended to generate a lot of false alarms. In some cases, the number of false alarms generated by these first analytics became so frustratingly high that some users decided to simply turn them off altogether.
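To see why pixel-based detection was so prone to false alarms, consider a minimal sketch of frame differencing. The function name, the two thresholds and the tiny grayscale frames are illustrative assumptions, not any vendor's implementation.

```python
def pixel_change_alarm(prev_frame, curr_frame, pixel_delta=25, changed_fraction=0.2):
    """Return True when enough pixels changed between two grayscale frames.

    Frames are equal-sized lists of rows holding 0-255 intensity values.
    Both thresholds are illustrative assumptions.
    """
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for before, after in zip(prev_row, curr_row):
            total += 1
            if abs(after - before) > pixel_delta:
                changed += 1
    return total > 0 and changed / total >= changed_fraction


# An empty 4x4 scene at uniform brightness.
empty_scene = [[100] * 4 for _ in range(4)]

# The same empty scene with a shadow darkening the left half.
shadowed_scene = [[60, 60, 100, 100] for _ in range(4)]

shadow_alarm = pixel_change_alarm(empty_scene, shadowed_scene)
no_change_alarm = pixel_change_alarm(empty_scene, empty_scene)
print(shadow_alarm, no_change_alarm)
```

Nothing entered the scene, yet the shadow alone is enough to trip the alarm, because the check has no notion of what the changed pixels represent.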
Fast-forward to today and you will find that video analytics has come a long way, thanks to better image processing and deep-learning software models trained to discern differences between objects and people. This makes it possible for the camera to capture highly granular metadata – such as the color of clothing, the type of vehicle, the direction an object is traveling – which makes it easier to locate and track the movement of people and objects through a scene, whether in real time or when searching archived footage.
Providing a Foundation for Deep Learning
How do developers train these more advanced analytics? They are taught by example. The more examples they are given for comparison, the more accurate they become. For instance, for a search algorithm to successfully locate a certain object of a precise color, it needs to be taught to recognize color and how to distinguish the differences between colors. If an algorithm is meant to trigger an alert on a particular type of vehicle, the analytics have to be programmed to discern the differences between motorcycles and bicycles, trucks and cars, buses and mobility scooters. With further training, a model could even learn to recognize and classify specific vehicle models from a chosen manufacturer.
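The idea of teaching by example can be shown with a toy sketch. It deliberately swaps deep learning for a much simpler technique, a nearest-centroid classifier over RGB values, and the function names and sample colors are assumptions made for illustration only.

```python
def train_color_centroids(examples):
    """Average the labeled RGB examples for each color name."""
    sums = {}
    counts = {}
    for (r, g, b), label in examples:
        total = sums.setdefault(label, [0, 0, 0])
        total[0] += r
        total[1] += g
        total[2] += b
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(channel / counts[label] for channel in total)
        for label, total in sums.items()
    }


def classify_color(rgb, centroids):
    """Return the label whose average color is closest to rgb."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(rgb, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical labeled examples; a real model would need far more.
examples = [
    ((200, 20, 30), "red"), ((230, 40, 40), "red"),
    ((20, 30, 200), "blue"), ((40, 60, 220), "blue"),
    ((30, 180, 40), "green"), ((50, 210, 60), "green"),
]
centroids = train_color_centroids(examples)
print(classify_color((210, 50, 60), centroids))
print(classify_color((30, 40, 180), centroids))
```

Adding more labeled examples per color shifts each average toward the true range of that color, which is the same intuition behind giving a deep-learning model more training data.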
Depending on the complexity of the application and the number of variables the algorithm needs to classify, developers might need to rely on big data sets to train their deep-learning models. The size of the data set needed to support the analytics generally determines where the analytics should reside. If the data set is relatively small – such as detecting whether a person is loitering or crossing into a restricted zone – the analytic can reside in-camera.
Placing analytics at the edge reduces latency and delivers greater accuracy since the video does not need to be compressed – and thus possibly lose critical details – when being transmitted to a server for analysis. But a camera’s system chip and deep-learning processor need to be sufficiently robust for the task.
For applications requiring larger data sets – like reading and classifying the differences between state license plates across the country – the analytics would likely reside on a server or in the cloud where they could efficiently compare a nationwide aggregate of motor vehicle data. Or for facial recognition analytics, the camera could capture an image of the face, but the image would likely need to be processed on a local or cloud server housing a large data set of facial images for comparison and ultimate identification.
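The edge-versus-server reasoning above can be sketched as a simple placement check. The class names, fields and numbers are hypothetical; real sizing depends on the specific camera hardware and analytic.

```python
from dataclasses import dataclass


@dataclass
class Analytic:
    name: str
    reference_data_mb: float  # data the analytic must compare against (assumed)
    required_tops: float      # processing need, in arbitrary assumed units


@dataclass
class Camera:
    free_storage_mb: float
    available_tops: float


def choose_placement(analytic, camera):
    """Run in-camera only if both the data set and the workload fit."""
    fits_data = analytic.reference_data_mb <= camera.free_storage_mb
    fits_compute = analytic.required_tops <= camera.available_tops
    return "edge" if fits_data and fits_compute else "server"


camera = Camera(free_storage_mb=512, available_tops=4.0)
loitering = Analytic("loitering detection", reference_data_mb=20, required_tops=1.0)
plates = Analytic("nationwide plate classification", reference_data_mb=50_000, required_tops=2.0)

print(choose_placement(loitering, camera))
print(choose_placement(plates, camera))
```

The check treats the camera's processor as a hard limit as well, reflecting the point that the system chip must be robust enough for the task even when the data set is small.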
Integrating Sensors Augments Scene Intelligence
While advanced analytics can greatly enhance situational awareness, integrating intelligent video cameras with other sensor technologies can take that scene intelligence to the next level. For instance, radar can provide another layer of context to the visual data such as the distance and speed of the person or object approaching or departing the scene.
Radar can also provide early detection of an event and direct a camera to automatically track the person or object. Audio detection devices add acoustic intelligence such as the ability to recognize, categorize and alert to the sound of weapon fire, breaking glass, or an aggressive tone in voices. Like radar, audio detection can be used to direct cameras to the location of the sound to visually verify the event.
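As a rough sketch of how a radar detection might steer a camera, the function below converts a range and bearing into pan and tilt angles. It assumes flat ground, a range measured along the ground, bearings measured clockwise from north, and a hypothetical camera mounting height; actual radar and camera interfaces differ by manufacturer.

```python
import math


def radar_to_pan_tilt(range_m, bearing_deg, radar_xy=(0.0, 0.0),
                      camera_xy=(0.0, 0.0), camera_height_m=5.0):
    """Convert a radar detection into camera pan and tilt angles in degrees.

    Pan is clockwise from north; tilt is negative when looking downward.
    """
    bearing = math.radians(bearing_deg)
    # Target position on the ground (x = east, y = north).
    target_x = radar_xy[0] + range_m * math.sin(bearing)
    target_y = radar_xy[1] + range_m * math.cos(bearing)
    # Offset from the camera to the target.
    dx = target_x - camera_xy[0]
    dy = target_y - camera_xy[1]
    ground_distance = math.hypot(dx, dy)
    pan = math.degrees(math.atan2(dx, dy)) % 360.0
    tilt = -math.degrees(math.atan2(camera_height_m, ground_distance))
    return pan, tilt


# A target 5 m due east of a co-located radar and camera mounted 5 m high.
pan, tilt = radar_to_pan_tilt(range_m=5.0, bearing_deg=90.0)
print(round(pan, 1), round(tilt, 1))
```

The same conversion would apply to an audio sensor that reports the direction of a sound, provided it supplies a comparable bearing and distance estimate.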
To achieve that interoperability, however, the physical security ecosystem needs to be built on an open standards platform and support standardized interfaces between devices and analytics. Manufacturers of security equipment usually provide open Application Programming Interfaces (APIs) and Software Development Kits (SDKs) to enable these devices and analytics to communicate with each other and exchange information.
Automating Alerts and Responses Based on Metadata
The ability to integrate scene intelligence from multiple devices and analytics not only helps to minimize false alarms but also generates a wealth of metadata that the analytics can use to trigger timely alerts and activate specific responses.
The decision tree for action can be programmed quite granularly based on the data the analytics are designed to detect and classify. The simplest action might be to send an event alert message to security operators or officers on patrol. Depending on the decision tree, the system might instead place a 911 call to local responders via an integrated VoIP phone system. But the automated response could also involve triggering an action by other connected technologies in the ecosystem.
For instance, as mentioned earlier, a radar sensor could send a geolocation alert to a camera to track an intruder’s movements. A fence guard analytic could trigger floodlights or a siren if it detects a person or vehicle attempting to enter a restricted area late at night. An audio analytic could initiate an automatic lockdown of all doors when it detects the sound of gunfire. Or, an object analytic could trigger a network speaker to broadcast a pre-recorded message to move a vehicle detected blocking an emergency exit.
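The examples above can be sketched as a small rule-based decision tree. The event fields, action names and the definition of "late at night" (10 p.m. to 5 a.m.) are all assumptions for illustration; a real deployment would map these to its own devices and policies.

```python
from datetime import time


def is_night(local_time, start=time(22, 0), end=time(5, 0)):
    """Assumed night window that wraps past midnight."""
    return local_time >= start or local_time < end


def choose_responses(event):
    """Map one analytics event (a dict) to a list of hypothetical action names."""
    actions = ["notify_operators"]  # simplest action: always alert staff
    kind = event["type"]
    if kind == "radar_intrusion":
        actions.append("camera_track_target")
    elif kind == "perimeter_crossing":
        if event.get("object") in {"person", "vehicle"} and is_night(event["local_time"]):
            actions += ["floodlights_on", "sound_siren"]
    elif kind == "gunshot_audio":
        actions.append("lock_all_doors")
    elif kind == "object_blocking" and event.get("zone") == "emergency_exit":
        actions.append("play_recorded_message")
    return actions


night_crossing = {"type": "perimeter_crossing", "object": "person", "local_time": time(23, 30)}
day_crossing = {"type": "perimeter_crossing", "object": "person", "local_time": time(14, 0)}
gunshot = {"type": "gunshot_audio", "local_time": time(9, 15)}

print(choose_responses(night_crossing))
print(choose_responses(day_crossing))
print(choose_responses(gunshot))
```

Keeping the rules in one function makes it easy to see which metadata fields drive each response, and to add branches as new sensors join the ecosystem.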
Metadata also plays a key role in facilitating searches through live and archived video. In addition to basic timelines, the extensive data being captured by intelligent video makes it possible to specify extremely granular parameters such as a person wearing a blue baseball cap, a green shirt and white shorts, traveling right to left across the field of view. With the advent of natural language queries, these types of searches have become much easier to conduct and yield faster, on-point results.
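A granular search of this kind amounts to filtering metadata records on several attributes at once. The sketch below uses structured criteria rather than a natural language query, and the record fields and values are hypothetical.

```python
def search_metadata(records, **criteria):
    """Return the records whose fields match every given criterion."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Hypothetical metadata records as a camera analytic might emit them.
records = [
    {"id": 1, "object": "person", "hat": "blue", "shirt": "green",
     "shorts": "white", "direction": "right_to_left"},
    {"id": 2, "object": "person", "hat": "blue", "shirt": "red",
     "shorts": "white", "direction": "left_to_right"},
    {"id": 3, "object": "vehicle", "color": "white", "direction": "right_to_left"},
]

matches = search_metadata(
    records, object="person", hat="blue", shirt="green",
    shorts="white", direction="right_to_left",
)
print([record["id"] for record in matches])
```

A natural language front end would, in effect, translate a typed or spoken description into criteria like these before running the same kind of filter.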
Applying Scene Intelligence to Unique Applications
What many companies are beginning to realize is that combining intelligent video with other intelligent sensors can serve purposes beyond physical security. In fact, just about any industry – from manufacturing to retail to aviation and education – could glean benefits from more comprehensive scene intelligence.
For instance, video analytics could be linked with thermal camera analytics and industrial temperature gauges to detect and alert on combustible materials or overheating equipment on a factory floor. Video analytics linked with vape detectors can help schools detect and identify students who are vaping on campus.
Hospitals can tie video analytics to access control technology to verify who is accessing the pharmaceuticals in a drug cabinet and trigger an alarm if there’s an identity discrepancy. Weather stations can use video analytics in conjunction with anemometers to track the speed and path of tornados and trigger emergency broadcasts through network speakers to communities in their path.
The flexibility of an open development environment makes it easier for developers to create, train and integrate advanced deep-learning modules into the ecosystem to meet customers’ unique needs. With a wealth of manufacturer-embedded development tools and third-party programming toolkits at their disposal, they can build their analytics from the ground up or use open-source libraries of deep-learning modules as building blocks for their customers’ specialized applications.
Influencing Future Direction
While the fusion of advanced video analytics and other sensors can provide exceptional scene intelligence for real-time alerts and action, the wealth of metadata can also help an organization see where they have been and where they are going. The metadata generated by all these devices can be used to measure compliance or the progress of operations optimization, or any other activity along a timeline.
And gaining a handle on where they’ve been and where they are now can help institutions better decide where they should be going.
This article originally appeared in the September / October 2023 issue of Security Today.