Mauricio Álvarez Mesa


Contact information
E-N 601/602
+49 (0)30 314-21357
+49 (0)30 314-22943

Office hours:
by appointment
Sekretariat EN 12
Einsteinufer 17
D-10587 Berlin



GPU Parallelization of HEVC In-Loop Filters
Citation key: Wang2017
Author: Biao Wang, Diego F. de Souza, Mauricio Alvarez-Mesa, Chi Ching Chi, Ben Juurlink, Aleksandar Ilic, Nuno Roma, and Leonel Sousa
Pages: 1–21
Year: 2017
ISSN: 1573-7640
DOI: 10.1007/s10766-017-0488-z
Journal: International Journal of Parallel Programming
Abstract: In the High Efficiency Video Coding (HEVC) standard, multiple decoding modules have been designed to take advantage of parallel processing. In particular, the HEVC in-loop filters (i.e., the deblocking filter and sample adaptive offset) were conceived to be exploited by parallel architectures. However, the type of parallelism offered mostly suits the capabilities of multi-core CPUs, which makes it a real challenge to efficiently exploit massively parallel architectures such as Graphics Processing Units (GPUs), mainly because of the data dependencies between the HEVC decoding procedures. Accordingly, this paper presents a novel strategy to increase the amount of parallelism and the resulting performance of the HEVC in-loop filters on GPU devices. For this purpose, the proposed algorithm performs the HEVC filtering at frame level and employs intrinsic GPU vector instructions. Compared with state-of-the-art HEVC in-loop filter implementations, the proposed approach also reduces the number of required memory transfers, further boosting performance. Experimental results show that the proposed GPU in-loop filters deliver a significant improvement in decoding performance. For example, average frame rates of 76 frames per second (FPS) and 125 FPS for Ultra HD 4K are achieved on an embedded NVIDIA GPU for the All Intra and Random Access configurations, respectively.



Mauricio Álvarez Mesa is currently a postdoctoral researcher in the Embedded Systems Architecture group at TU Berlin. He received the MSc degree in Electronic Engineering in 2000 from the University of Antioquia, Medellin, Colombia, and the PhD degree in Computer Science in 2011 from Universitat Politècnica de Catalunya (UPC), Barcelona, Spain. From 2006 to 2011 he was an adjunct lecturer at UPC. He was a summer intern at IBM Haifa Research Labs, Israel, in 2007, and a research visitor at Technische Universität Berlin (TU Berlin), Berlin, Germany, in 2011. From 2012 to 2013 he was a research associate in the Multimedia Communications group of the Fraunhofer Institute HHI in Berlin. At TU Berlin he currently leads the LPGPU European project and the High Performance Video Coding research line. He has co-authored more than 20 publications in the fields of video coding, parallel computing, and computer architecture.

