Next Generation Video Codecs in Action



Zipstream’s tactical video and situation awareness advantages

4K and Ultra High Definition (UHD) video content has approximately four times the resolution (four times the number of pixels) of 1080p full HD, and 1080p content has more than twice the resolution of 720p. Add audio, metadata and potentially multiple camera streams, and you’ll soon be asking how you can deliver such high resolution over limited-bandwidth connections, or simply how to reduce storage costs.
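The pixel arithmetic behind these ratios can be checked directly. The sketch below assumes the common frame sizes of 3840x2160 for UHD, 1920x1080 for full HD and 1280x720 for 720p; these dimensions are assumptions, not figures stated in this article.

```python
# Frame sizes are assumed (common definitions), not taken from the article.
RESOLUTIONS = {
    "UHD": (3840, 2160),
    "1080p": (1920, 1080),
    "720p": (1280, 720),
}


def pixel_count(name):
    """Return the number of pixels in one frame of the named format."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(larger, smaller):
    """Return how many times more pixels `larger` has than `smaller`."""
    return pixel_count(larger) / pixel_count(smaller)


print(pixel_ratio("UHD", "1080p"))   # 4.0
print(pixel_ratio("1080p", "720p"))  # 2.25
```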

Next-generation video codecs are the most promising answer, but you will quickly realize that how well a specific implementation is matched to the use case matters a great deal.

First responders and tactical video

First responders represent a growing class of technical users that use “tactical” video for live and forensic review applications. The Video Quality in Public Safety (VQIPS) working group of the Department of Homeland Security Science and Technology Directorate (DHS S+T) provides guidance on how to deploy mission-oriented video streaming to the Next Generation First Responder.

“If we were able to track [first responder] status using technology tools, the on scene commanders would know what is happening in real time,” according to Paramedic Don MacGarry of Loudoun County Fire and Rescue in Virginia.

If a police officer has a tactical video stream showing where an active shooter is, they will have a better chance of directing the appropriate response and saving lives. Advanced thermal imaging network cameras can also deliver temperature information in addition to the video stream. Improved situation awareness during a fire, including which side of the building is burning and which openings are available, is essential for placing assets.

Next generation codecs make it possible to deliver mission-critical video streaming at reduced bitrates. Axis Communications’ Zipstream technology is one example of an efficient codec that makes it possible to use higher resolution and increase forensic detail, while reducing storage cost and enabling longer recordings. The core competence of a next generation codec is the ability to allow a high bit rate in scenes with significant detail while using a low bit rate when the scene is relatively static.

Crime fighting Tactical Video Intelligence

In another city, a specialized team of law enforcement professionals has just received intelligence of a new “crib” potentially nearby, where illegal narcotics and weapons are being stored. Two members of the city’s gang violence reduction unit start a tour and soon receive a “hit” from a fixed surveillance camera that has a built-in license plate recognition (LPR) application and is connected to the NCIC [1].

The detecting camera has a microcomputer connected to a NAS (network attached storage). A “hot list” was recently uploaded to the NAS device after the city’s crime analysis software related the vehicle registration of a known gang associate to the primary suspects.

A “beacon” application notifies the nearby team, as well as the Command Center, of the alert. The camera is equipped with a “next generation” codec called Zipstream, so a clip of the vehicle passing by the camera is pushed to law enforcement on site. The camera is capable of rendering details through forensic capture, and the team sees that the vehicle occupants are armed with automatic weapons.

Knowing the location of law enforcement assets, the domain awareness system (DAS) command center automatically pushes out notifications of the filtered social media chatter to the nearby detectives and tactical response team. The cloud-based social media app has been listening for known gang language, keywords and locations, focusing the intel on just what is most vital in this operation. A warrant is obtained and the unit leader requests the “go-ahead” for the operation from the DAS. A traffic management application creates a protective radius around the site of the operation, freezing traffic and dispatching EMS for potential injuries, and HAZMAT should there be toxic drug production onsite. The team executes the warrant and takes key gang members into custody.

The Internet of Things (IoT) includes devices that can be detected on a network, authenticated and updated. Network video surveillance cameras are examples of these IoT devices. By connecting “everything,” the number of IoT devices will reach approximately seven times the number of people on earth today by 2020, according to Cisco.

The continuing growth in demand from subscribers for better voice, video and mobile broadband experiences is encouraging the industry to look ahead at how networks can be readied to meet future extreme capacity and performance demands.

According to Nokia [2], 10,000 times more traffic will need to be carried through all mobile broadband technologies at some point between 2020 and 2030. Nokia made this prediction in 2010 and reports that market information gathered since then shows the growth it foresaw is actually happening. The need for more capacity to accommodate the demands of video streaming goes hand-in-hand with access to more spectrum on higher carrier frequencies.

We will see growth to between ten and a hundred video streaming devices for each mobile communications user. Even now, many people have a phone, tablet, laptop and a few Bluetooth-enabled devices.

A “next generation” codec includes an encoder that offers efficient, real-time compression of video, audio and metadata for more efficient streaming, decoding and storage, ultimately taking up less disk space. The decoder extracts the audio or video information from the compressed video stream in real time or for forensic review purposes.



The High Efficiency Video Coding (HEVC) standard, also known as h.265, was developed by JCT-VC to improve on the compression efficiency of Advanced Video Coding (AVC, h.264) and to support the development of UHD systems. Like AVC, HEVC is subject to patent licensing and usage is not free. In addition, the group HEVC Advance wants additional revenue from any paid streaming-video services delivered with HEVC. HEVC supports increased use of parallel processing architectures and adopts effective motion vector prediction techniques to reduce bandwidth.

Mobile providers need to conserve bandwidth to effectively deliver a quality mobile video experience. The replacement rate on phones is much faster than on other consumer electronics, so native HEVC support is primarily focused on the smartphone industry. As of this date, Apple’s iPhone 6/6S and iPhone 6/6S Plus natively support HEVC for FaceTime; Google’s Android operating system and Qualcomm also include support.

HEVC encoding is highly processor intensive, which could create performance issues for network cameras that do not use efficient architectures. For example, an Intel Core i7 4790K 4 GHz processor achieves 52 frames per second in an AVC/x264 benchmark [3], but only 15 fps with HEVC [4]. In practice, this means more CPU cores must be dedicated to encoding. With processor manufacturers focusing on lowering power consumption as a priority, HEVC’s intensive encoding requirements may be better suited to servers or high performance network cameras dedicated to video streaming only, without additional processes like video analytics.
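As a rough reading of these two benchmark figures, the sketch below estimates how much more processing HEVC encoding would need to match AVC throughput on the same processor. It assumes throughput scales linearly with the number of processors, which real encoders only approximate, and the 30 fps target and function names are illustrative assumptions.

```python
import math

AVC_FPS = 52.0   # x264 benchmark figure cited above [3]
HEVC_FPS = 15.0  # x265 benchmark figure cited above [4]


def relative_cost(reference_fps, candidate_fps):
    """How many times more compute per frame the slower codec needs."""
    return reference_fps / candidate_fps


def processors_for_target(target_fps, fps_per_processor):
    """Processors needed for real-time encoding, assuming linear scaling."""
    return math.ceil(target_fps / fps_per_processor)


print(round(relative_cost(AVC_FPS, HEVC_FPS), 2))  # 3.47
print(processors_for_target(30, AVC_FPS))          # 1
print(processors_for_target(30, HEVC_FPS))         # 2
```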

Encoding for the surveillance industry

AVC (h.264) and HEVC (h.265) were designed primarily for the motion picture industry. A compression technology purpose-built for the security industry was needed. Manufacturers and solution providers had to work around and overcome inefficiencies in low light rendering to avoid higher bandwidth consumption, and technologies like Lightfinder, forensic capture and wide dynamic range were developed.

Figure 2: Forensic Capture and wide dynamic range

A common surveillance scenario is shown in Figure 2. The subject’s face is dark, while the background is lit by very bright sunlight. This kind of scene has a very wide dynamic range. You can expose the camera sensor based on either the dark area or the very bright areas. Neither picture gives a satisfying result by itself, but if you combine them you get a new image in which both exposures are used. Visual acuity is improved, and noise and bit rate are reduced. Low light scenes provide a greater opportunity for bandwidth savings due to the decreased video noise.

Bit rate reduction and Zipstream

A video clip typically contains 25 or more frames per second. Instead of sending the entire frame every time, a video codec can save a lot of data by sending only the differences between frames. The capacity of a network connection to carry digital multimedia content (video, audio, metadata) is commonly known as bitrate or bandwidth and is measured in megabits per second. A 4G mobile connection might be capable of carrying multiple megabits per second; a home broadband connection may support ten times that amount. To stream video reliably, the bitrate or bandwidth capacity of the network connection must be greater than the bitrate of the streaming media file.
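The idea of sending only the differences between frames can be illustrated with a deliberately simplified sketch. Real codecs such as h.264 use motion-compensated prediction on blocks rather than per-pixel deltas; the tiny one-dimensional "frames" and all function names here are illustrative assumptions.

```python
def frame_delta(previous, current, threshold=0):
    """Return {pixel_index: new_value} for pixels that changed by more than threshold."""
    return {
        i: new
        for i, (old, new) in enumerate(zip(previous, current))
        if abs(new - old) > threshold
    }


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame


def link_can_carry(stream_mbps, link_mbps):
    """True when the link capacity exceeds the bitrate of the stream."""
    return link_mbps > stream_mbps


# A bright object (value 200) moves one pixel to the right between frames.
frame_a = [10, 10, 10, 10, 200, 200, 10, 10]
frame_b = [10, 10, 10, 10, 10, 200, 200, 10]

delta = frame_delta(frame_a, frame_b)
print(delta)                                   # {4: 10, 6: 200}
print(apply_delta(frame_a, delta) == frame_b)  # True
print(link_can_carry(stream_mbps=4.0, link_mbps=10.0))  # True
```

Only two of the eight pixel values need to be sent for the second frame, which is where the data saving comes from.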

Using networked video with Variable Bit Rate (VBR) allows the quality to adapt to scene content in real-time. Using Constant Bit Rate (CBR) as a storage reduction strategy is not recommended, since cameras delivering CBR video may have to discard important forensic details in critical situations due to the bit rate limit.
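The risk of a fixed bit rate limit can be sketched with a toy model. The per-second demand figures and the 1,000 kbps cap below are hypothetical, and a real CBR encoder would respond by compressing harder (losing detail) rather than literally truncating data; the sketch only shows how much of the scene's demand cannot be met.

```python
def cbr_delivered(demand_kbps, cap_kbps):
    """Bit rate actually delivered each second when the encoder is capped."""
    return [min(d, cap_kbps) for d in demand_kbps]


def shortfall(demand_kbps, cap_kbps):
    """Total demand (in kilobits) that the cap cannot accommodate."""
    return sum(max(d - cap_kbps, 0) for d in demand_kbps)


# Hypothetical scene: quiet, then an incident with a lot of motion, then quiet again.
demand = [300, 300, 2500, 3000, 400]

print(cbr_delivered(demand, 1000))  # [300, 300, 1000, 1000, 400]
print(shortfall(demand, 1000))      # 3500
```

With VBR, the stream would be allowed to rise to 2,500 and 3,000 kbps during the incident, which is exactly when forensic detail matters most.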

A significantly more efficient implementation of AVC/h.264 video encoding that is purpose-built for surveillance applications, Zipstream analyzes and optimizes the video stream in real time to save bandwidth and storage while maintaining image quality. This technology makes it possible to continue using VBR for optimum video quality while reducing the storage requirements. Important forensic details such as facial features and vehicle plates are preserved, while scene elements that stay constant, like walls, land and other surfaces, are rendered at a higher compression. In Figure 3, Zipstream rendering preserves moving scene elements, represented by the green areas, while static areas are rendered at a higher compression. Should movement enter the static areas, the scene rendering is adapted quickly to preserve details.

Figure 3: Example Zipstream Rendering Preserves Moving Scene Elements
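The general principle of compressing static regions harder than moving ones can be sketched as follows. This is not Axis's Zipstream algorithm, which is not described in this article; it is a generic illustration that assigns an h.264-style quantization parameter (QP, where a higher value means coarser compression) per block. The threshold, the QP values and the function names are all assumptions chosen for illustration.

```python
def block_activity(prev_block, curr_block):
    """Mean absolute pixel difference between co-located blocks."""
    total = sum(abs(c - p) for p, c in zip(prev_block, curr_block))
    return total / len(curr_block)


def choose_qp(activity, motion_threshold=5.0, qp_detail=24, qp_static=38):
    """Lower QP (finer detail) for moving blocks, higher QP for static ones."""
    return qp_detail if activity > motion_threshold else qp_static


def qp_map(prev_blocks, curr_blocks):
    """Return one QP per block for the current frame."""
    return [
        choose_qp(block_activity(p, c))
        for p, c in zip(prev_blocks, curr_blocks)
    ]


# Three tiny 4-pixel blocks: slight noise, real movement, no change.
prev_blocks = [[50, 50, 50, 50], [80, 80, 80, 80], [20, 20, 20, 20]]
curr_blocks = [[50, 51, 50, 50], [80, 140, 150, 80], [20, 20, 20, 20]]

print(qp_map(prev_blocks, curr_blocks))  # [38, 24, 38]
```

Because the map is recomputed for every frame, a block that starts moving switches to the finer QP on the next frame, mirroring the quick adaptation described above.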

Next generation codecs like Zipstream are backward compatible with existing h.264 (AVC) decoding solutions, including video management systems (VMS), physical security information management systems (PSIM), video communications (VC) solutions, Axis Camera Management and Axis Camera Companion Software. It is interesting to note that the h.264 standard does not specify the algorithm used to compress video; only the syntax and the method of decoding and playback are standardized.

This is intentional and enables improved H.264 encoding solutions to be created while keeping the file format for interoperability, ease of forensic review and player compatibility.

The following three examples illustrate Zipstream improvement (bitrate reduction) over AVC for a constant image quality:

  1. Zipstream comparison: outdoor city view, medium activity, low light, 720p resolution (Figure 4)
  2. Zipstream comparison: outdoor parking area with high activity due to snow and vehicles, daylight, 720p resolution (Figure 5)
  3. Zipstream comparison: outdoor pedestrian path, low activity, daylight, 720p resolution (Figure 6)

Figure 4: Zipstream comparison: outdoor city view, medium activity, low light, 720p resolution

Figure 5: Zipstream comparison: outdoor parking area with high activity due to snow and vehicles, daylight, 720p resolution

Figure 6: Zipstream comparison: outdoor pedestrian path, low activity, daylight, 720p resolution

So how do the next generation codecs compare?

HEVC verification tests were conducted by the Joint Collaborative Team on Video Coding of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29. The paper “Video Quality Evaluation Methodology and Verification Testing of HEVC Compression Performance” [5] presents the subjective and objective results of a verification test in which the performance of HEVC and AVC is compared.

The test used video sequences with resolutions ranging from 480p up to UHD, encoded at various quality levels using the HEVC Main profile and the AVC High profile. The tests showed that bit rate savings of 59% on average can be achieved by HEVC for the same perceived video quality; however, it has been shown that the bit rates required to achieve good quality of compressed content, as well as the bit rate savings relative to AVC, are highly dependent on the characteristics of the tested content.
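To put the 59% average figure in concrete terms, the sketch below converts a bit rate saving into a resulting bit rate and a storage requirement. The 4 Mbps AVC reference stream and the 24-hour recording period are hypothetical values chosen for illustration, and as noted above, actual savings depend heavily on content.

```python
def bitrate_after_savings(reference_mbps, savings_fraction):
    """Bit rate remaining after applying a fractional saving."""
    return reference_mbps * (1.0 - savings_fraction)


def storage_gb(bitrate_mbps, hours):
    """Storage in gigabytes (10^9 bytes) for a stream at a given average bit rate."""
    bits = bitrate_mbps * 1_000_000 * hours * 3600
    return bits / 8 / 1_000_000_000


avc_mbps = 4.0  # hypothetical AVC stream
hevc_mbps = bitrate_after_savings(avc_mbps, 0.59)  # 59% average saving cited above

print(round(hevc_mbps, 2))                  # 1.64
print(round(storage_gb(avc_mbps, 24), 1))   # 43.2
print(round(storage_gb(hevc_mbps, 24), 1))  # 17.7
```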

Since there is also data comparing Zipstream’s bitrate savings to AVC, and several scenes in the IEEE study are similar, we are able to provide an approximate comparison between HEVC, AVC and Zipstream. The results are presented in the table “Comparison of HEVC bit rate savings over AVC and Zipstream bit rate savings over AVC.”

We can see that Zipstream’s savings for its three use cases are approximately the same as HEVC’s savings for the similar HEVC use cases. Because Zipstream provides backward compatibility with existing video management solutions and smartphones, reduces bitrate, requires fewer processor resources for encoding, offers multiple configuration parameters and avoids the threat of additional HEVC licensing, it represents a significant opportunity for tactical and forensic video professionals to achieve significant performance increases.

[1] The Federal Bureau of Investigation (FBI) compares license plates against its National Crime Information Center (NCIC) database. As law enforcement agencies take advantage of advanced technologies, the opportunities for using data to help with investigations increase greatly.

LPR cameras locate license plates within an image and decode them using automatic number plate recognition and character recognition.

[2] “5G use cases and requirements,” Nokia Networks FutureWorks, 2014

[3] X264 Benchmark

[4] X265 Benchmark

[5] “Video Quality Evaluation Methodology and Verification Testing of HEVC Compression Performance”
