TU Berlin

High performance video coding


H.264 is the current standard for video coding, offering substantial improvements in compression rate and quality over its predecessors. These gains come at a cost: the computational complexity of H.264 is much higher than that of previous standards. This trend will continue with HEVC, the future standard that will succeed H.264. Real-time decoding and encoding therefore require leveraging multicore processors.

Parallel video decoding is challenging, however. The encoding algorithms focus on reducing redundant information as much as possible, which leads to complex dependencies between decoding kernels. Research in this field has yielded a fully parallel Quad HD/4K capable decoder, which is able to run with high performance on a wide range of parallel platforms.
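To make the dependency problem concrete, the following is a minimal sketch of a macroblock wavefront schedule. It assumes the commonly described H.264 dependency pattern, in which a macroblock depends on its left, top-left, top, and top-right neighbours; the function name `wavefront_schedule` and the frame dimensions are illustrative and are not taken from the decoder described here.

```python
from collections import defaultdict


def wavefront_schedule(mb_width, mb_height):
    """Group macroblocks (x, y) by the earliest step at which they can be decoded.

    Assumption: macroblock (x, y) depends on (x-1, y), (x-1, y-1), (x, y-1)
    and (x+1, y-1). Assigning step x + 2*y guarantees that every one of
    those neighbours is placed in a strictly earlier step.
    """
    steps = defaultdict(list)
    for y in range(mb_height):
        for x in range(mb_width):
            steps[x + 2 * y].append((x, y))
    total_steps = mb_width + 2 * (mb_height - 1)
    return [steps[t] for t in range(total_steps)]


# A 1920x1080 frame is coded as 120 x 68 macroblocks of 16x16 pixels.
schedule = wavefront_schedule(120, 68)
print(len(schedule))                             # number of sequential steps
print(max(len(group) for group in schedule))     # peak number of independent macroblocks
```

Under these assumptions, all macroblocks within one step are mutually independent, so the size of the largest step bounds the parallelism available inside a single frame.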

Research in this area includes investigating the possibilities of parallelizing HEVC, investigating the use of GPUs for parallel decoding and encoding, implementing a QHD H.264 demo video setup and player interface, and optimizing the video coding kernels for SIMD, VLIW, and TTA architectures.

H.264 OpenCL decoder

To employ the power of GPUs for massively parallel processing, this work offloads parallel kernels of H.264 decoding. H.264 decoding can be divided into two main stages: entropy decoding and macroblock reconstruction. The latter, in turn, can be divided into four kernels: intra prediction, inter prediction (motion compensation), inverse transform, and the deblocking filter. The OpenCL H.264 decoder offloads the inverse transform and motion compensation onto OpenCL devices. Using OpenCL requires an OpenCL-capable device, a driver, and the corresponding OpenCL SDK to be installed.

For more information about OpenCL, see www.khronos.org/opencl/.

The most recent version of the H.264 OpenCL decoder can be downloaded here (GZ, 394.5 KB) as a tarball.

Instructions on how to install and use the decoder are included in the package. The H.264 OpenCL decoder is developed by the AES LPGPU team under the lead of Prof. Dr. Ben Juurlink. This project receives funding from the European Community's Seventh Framework Programme [FP7/2007-2013] under the LPGPU Project (www.lpgpu.org), grant agreement n°288653.

If you have any questions regarding the H.264 OpenCL decoder, please write an email to:

Video coding using HW/SW codesign

As pure software solutions often cannot unleash the full potential of modern video coding standards, SoCs with dedicated hardware components are used in many mass-market multimedia devices. To reduce both the production cost and the energy consumption of these devices, it is crucial to find the right partitioning between hardware and software implementations and to apply efficient interconnect technologies. Our group therefore investigates methods for high-speed, low-cost encoding and decoding of video streams using modern HW/SW codesign techniques, state-of-the-art FPGAs, and embedded processors.


Chi Ching Chi and Ben Juurlink and C.H. Meenderinck (2010). Evaluation of Parallel H.264 Decoding Strategies for the Cell Broadband Engine. Proceedings International Conference on Supercomputing (ICS)

Chi Ching Chi and Ben Juurlink (2011). A QHD-Capable Parallel H.264 Decoder. Proceedings 25th International Conference on Supercomputing

Arnaldo Azevedo and Ben Juurlink and Cor Meenderinck and Andrei Terechko and Jan Hoogerbrugge and Mauricio Alvarez Mesa and Alex Ramírez and Mateo Valero (2011). A Highly Scalable Parallel Implementation of H.264. T. HiPEAC, 111-134.

Mauricio Alvarez Mesa and Chi Ching Chi and Ben Juurlink and V. George and T. Schierl (2012). Parallel Video Decoding in the Emerging HEVC Standard. Proceedings of the 37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012

Chi Ching Chi and Mauricio Alvarez Mesa and Ben Juurlink and V. George and T. Schierl (2013). Improving the Parallelization Efficiency of HEVC Decoding. Proceedings of the 2012 International Conference on Image Processing (ICIP)

Ben Juurlink and Mauricio Alvarez-Mesa and Chi Ching Chi and Arnaldo Azevedo and Cor Meenderinck and Alex Ramirez (2012). Scalable Parallel Programming Applied to H.264/AVC Decoding. Springer.

Chi Ching Chi and Mauricio Alvarez-Mesa and Jan Lucas and Ben Juurlink and T. Schierl (2013). Parallel HEVC Decoding on Multi- and Many-core Architectures: A Power and Performance Analysis. Journal of Signal Processing Systems

Biao Wang and Mauricio Alvarez Mesa and Chi Ching Chi and Ben Juurlink (2013). An Optimized Parallel IDCT on Graphics Processing Units. Lecture Notes in Computer Science

Chi Ching Chi and Mauricio Alvarez Mesa and Ben Juurlink and G. Clare and F. Henry and S. Pateux and T. Schierl (2012). Parallel Scalability and Efficiency of HEVC Parallelization Approaches. IEEE Transactions on Circuits and Systems for Video Technology

Benjamin Bross and Mauricio Alvarez-Mesa and Valeri George and Chi Ching Chi and Tobias Mayer and Ben Juurlink and Thomas Schierl (2013). HEVC real-time decoding. Applications of Digital Image Processing XXXVI. Proceedings of SPIE

Benjamin Bross and Valeri George and Mauricio Alvarez-Mesa and Tobias Mayer and Chi Ching Chi and Jens Brandenburg and Thomas Schierl and Detlev Marpe and Ben Juurlink (2013). HEVC Performance and Complexity for 4K Video. IEEE Third International Conference on Consumer Electronics - Berlin (ICCE-Berlin)

Matthias Göbel (2014). A High-Performance Hardware Accelerator for HEVC Motion Compensation. Lecture Notes in Informatics - Seminars, Informatiktage 2014, 209-212.

Philipp Habermann (2014). Design and Implementation of a High-Throughput CABAC Hardware Accelerator for the HEVC Decoder. Lecture Notes in Informatics - Seminars, Informatiktage 2014, 213-216.
