Video processing is a key technology in the modern world; it enables electronic systems to capture, process, and extract the data contained within video streams. Video processing is therefore the foundational technology for many applications, from smart city traffic management to broadcast.
All these applications require the capacity to process high-resolution frames, for example 4K or 8K resolution at frame rates of 60 frames per second or beyond. This equates to processing roughly 500 million pixels per second at 4K resolution, or nearly 2 billion pixels per second at 8K resolution. These are challenging performance figures even for a simple capture-and-display pipeline that does nothing but show the received video. When additional processing steps are needed, for example to detect and classify an object or to perform transcoding, the processing requirements necessary to sustain the frame rate are considerable. This is especially true when the video analysis is time-critical, as in smart city traffic monitoring deployments where advanced algorithms predict and smooth traffic flow using artificial intelligence and machine learning. Implementing these algorithms can easily introduce application bottlenecks that significantly impact performance, especially when the processing requires multiple operations on each pixel or on arrays of pixels.
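The arithmetic behind these figures is simple to reproduce; the short C++ sketch below computes the pixel rates for UHD 4K and 8K at 60 frames per second.

```cpp
#include <cstdint>
#include <cstdio>

// Pixel rate = width x height x frames per second.
constexpr std::uint64_t pixel_rate(std::uint64_t w, std::uint64_t h,
                                   std::uint64_t fps) {
    return w * h * fps;
}

int main() {
    // 4K UHD at 60 fps: 3840 x 2160 x 60 = ~498 million pixels per second.
    std::printf("4K@60: %llu pixels/s\n",
                static_cast<unsigned long long>(pixel_rate(3840, 2160, 60)));
    // 8K UHD at 60 fps: 7680 x 4320 x 60 = ~1.99 billion pixels per second.
    std::printf("8K@60: %llu pixels/s\n",
                static_cast<unsigned long long>(pixel_rate(7680, 4320, 60)));
    return 0;
}
```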
Creating complex video processing systems extends beyond the need for pure processing capability; it also requires a high I/O count to interface with a diverse array of external sensors, cameras, and actuators. For a smart city traffic management system, this might mean supporting multiple video sensors while providing high-performance network interfacing and local storage/recording of critical events using JPEG XS. To give another example, consider a medical-surgical robotics system that relies on video processing. Such a system must interface with its sensors while simultaneously controlling illumination and providing fine control over a range of motors and actuators. For both of these applications, the interfacing challenges can be significant. There is strong industry demand for devices that deliver the performance to support multiple high-speed sensors while also providing the capabilities needed to communicate over a wide range of networking and industrial interfaces.
The Role of FPGAs in Video Processing
One of the leading technologies used by system design engineers to address these performance and interfacing challenges is the Field Programmable Gate Array (FPGA). FPGAs present the designer with logic resources that can be used to implement highly parallel, pipelined processing structures. The I/O structures of FPGAs are as flexible as the internal fabric, enabling both high-speed and low-speed interfacing. This flexibility allows the FPGA to support several high-performance video sensors and networking interfaces while also implementing the low-bandwidth industrial, legacy, and custom interfaces used to control actuators, sensors, motors, and other external devices.
Implementing video processing algorithms in logic enables the creation of deeply parallelized designs. These parallel implementations increase determinism and reduce latency because bottlenecks in the processing system can be removed.
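To illustrate the idea, consider the hypothetical 3-tap filter below, written as an ordinary C++ loop. On a CPU the loop executes sequentially; an FPGA implementation can pipeline the loop body so that one output pixel is produced per clock cycle, with multiple copies of the pipeline operating on separate lines in parallel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 3-tap horizontal blur over one video line. In programmable
// logic, the multiplies and adds below map to independent hardware operators,
// so the whole expression can evaluate as a single stage of a pixel pipeline.
std::vector<std::uint8_t> blur_line(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out(in.size(), 0);
    for (std::size_t x = 1; x + 1 < in.size(); ++x) {
        out[x] = static_cast<std::uint8_t>(
            (in[x - 1] + 2 * in[x] + in[x + 1]) / 4);
    }
    return out;
}
```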
Selecting the FPGA
Of course, the choice of FPGA will vary between applications to ensure the most effective solution. Design engineers select devices based upon logic capacity and performance, interfacing capabilities, and specialized hard macros. For example, devices from the Intel Arria 10 family are often selected for medical and Pro A/V video processing applications, while devices in the Stratix 10 family are well suited for broadcast solutions. In addition to high-performance logic, the Arria 10 family provides developers with a range of high-bandwidth interconnectivity options in the GT and GX variants, while the SX variant provides hard Arm Cortex-A9 processors that enable the implementation of sequential processing tasks such as human-machine interfaces (HMIs), GUIs, and communication protocols.
The Intel Stratix 10 family provides a significant step change in capability, offering embedded Arm Cortex-A53 cores in SX devices, high-performance floating-point and throughput solutions in GX devices, and support for AI/ML in NX devices. This wide variety of devices allows the developer to select the most appropriate FPGA for the application at hand.
Regardless of the device selected, design engineers need a wide range of production-ready IP to meet increasingly demanding project timescales.
Within the Intel Quartus Prime Design Software, developers can make use of Intel’s comprehensive Video and Image Processing (VIP) Suite. This suite features more than twenty highly optimized, production-ready IP blocks that provide the core functionality needed to implement video and image processing pipelines. To enable high-performance integration and interconnection of the VIP Suite cores, the IP blocks connect using Intel’s Avalon streaming interface. This enables a mix-and-match approach, with blocks inserted into the video processing pipeline as required (a software sketch of this streaming model follows the list below). The video IP provides the design engineer with a range of capabilities, including:
- Interfacing: Support for a range of different camera and sensor interfaces from HDMI to SDI, DisplayPort, MIPI, and Ethernet (GigE Vision)
- Capture, Correction, and Processing: Ability to format the video as required for processing (for example, color space conversion, de-interlacing, gamma correction, clipping, chroma resampling, and synchronization) and to remove temporal and spatial noise from the video using the 2D filter and the video stream cleaner.
- Formatting: The ability to format an output video using alpha blending, scaling, and interlacing.
- Buffering: Support for reading and writing frame buffers in DDR. This enables the developer to change input and output frame rates and make the processed video available to the processor system for high-level video processing.
- Analytics and Test: Support for on-the-fly video statistics and for test pattern generation, which allows the video processing path to be exercised without a sensor or camera present.
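The sketch below is a minimal software model of this mix-and-match streaming approach. It loosely mirrors the framing of streamed video packets (start-of-packet and end-of-packet markers, one pixel per beat); the real Avalon streaming interface also carries ready/valid backpressure and control packets, which are omitted here, and the stage names are illustrative rather than actual VIP core names.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// One beat of a streamed video frame: a pixel plus packet framing flags.
struct Beat {
    std::uint8_t pixel; // 8-bit grayscale for simplicity
    bool sop;           // first beat of a frame
    bool eop;           // last beat of a frame
};
using Stream = std::vector<Beat>;

// Each stage consumes one stream and produces another, so stages can be
// chained, inserted, or dropped independently: the mix-and-match property
// that a common streaming interface provides.
Stream invert(Stream s) {
    for (auto& b : s) b.pixel = 255 - b.pixel;
    return s;
}

Stream clip(Stream s, std::uint8_t lo, std::uint8_t hi) {
    for (auto& b : s)
        b.pixel = b.pixel < lo ? lo : (b.pixel > hi ? hi : b.pixel);
    return s;
}

// Compose the pipeline by chaining stages, much as VIP cores are chained
// in hardware. Here the clip range is the broadcast-legal 16..235.
Stream pipeline(Stream src) {
    return clip(invert(std::move(src)), 16, 235);
}
```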
While the Video and Image Processing Suite is extensive, other specialist IP functionality might be required. In this instance, the developer can leverage a large range of partner ecosystem IP. Such IP partners include intoPIX, which provides a range of compression IP including JPEG XS; Rambus (previously Northwest Logic), which provides MIPI interfacing solutions; and Macnica, which provides a range of video-over-IP solutions.
This wide range of Intel® and partner ecosystem IP enables developers to build a custom video processing application quickly and easily. For custom algorithm implementations, the developer can leverage Intel’s HLS Compiler, which allows the algorithm to be defined in a higher-level language (C++), further reducing the design and verification time compared to a register transfer level (RTL) implementation.
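As a rough illustration of the HLS flow, the sketch below expresses a simple per-pixel threshold as a C++ function intended for hardware generation. It assumes the Intel HLS Compiler’s component and stream conventions; the exact header name and stream types should be checked against the current tool documentation.

```cpp
#include "HLS/hls.h"  // Intel HLS Compiler header (assumed toolchain)

// Simple per-pixel threshold written for an HLS flow. The `component`
// keyword marks the function for hardware generation; the stream arguments
// model the streaming input and output of a pipeline stage. The compiler
// pipelines the loop so that, once filled, it processes one pixel per cycle.
component void threshold(ihc::stream_in<unsigned char>& in,
                         ihc::stream_out<unsigned char>& out,
                         int pixel_count, unsigned char level) {
    for (int i = 0; i < pixel_count; ++i) {
        unsigned char p = in.read();
        out.write(p > level ? 255 : 0);
    }
}
```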
Conclusion
Creating modern video processing applications capable of supporting 4K and 8K resolutions requires both significant processing and interfacing capabilities. The wide range of Intel and partner ecosystem video processing and connectivity IP allows designers to pick and choose functionality, while the high-performance FPGA fabric is ideal for processing high-resolution video streams. These features, along with a robust software design flow, provide for rapid development of next-generation intelligent video applications.