
Browsing by Author "Michael Girma"

  • Acceleration of H.266 Encoding Using OPENCL And Vectorization with Block Size Variation
    (Addis Ababa University, 2025-06) Michael Girma; Fitsum Assamnew (PhD)
    Versatile Video Coding (H.266) achieves approximately a 50% reduction in bitrate compared to its predecessor, H.265 (HEVC). However, this improvement in compression efficiency comes with a significant increase in computational complexity, presenting major challenges for real-time encoding on general-purpose processors. Most existing H.266 (VVC) implementations rely heavily on CPU-only processing or on vendor-specific GPU solutions such as CUDA, which limits portability and cross-platform compatibility. Moreover, these approaches often fail to fully utilize modern heterogeneous CPU-GPU architectures, leaving substantial performance potential unexploited. This work proposes an OpenCL-based H.266 encoding solution aimed at delivering high performance, broad cross-platform support, and efficient hardware utilization. Key encoding modules, including block partitioning, prediction, transform and quantization, loop filtering, and entropy coding, are implemented as OpenCL kernels to leverage task-level parallelism across both CPUs and GPUs. Additionally, AVX and SSE vectorization techniques are applied on the CPU side to enhance per-core throughput, particularly in compute-intensive operations such as transform and quantization. Experimental results across various platforms demonstrate significant performance improvements. On an NVIDIA V100 GPU, the OpenCL-accelerated encoder achieves speedups of up to 7500× compared to a sequential implementation running on an Intel Xeon E5-2698 v4, with peak efficiency observed at a block size of 512×512. Tests conducted on an Intel UHD 620 GPU and an Intel i5-8265U CPU reveal speedups ranging from 15.5× to 370×, depending on the block size. The findings suggest that medium block sizes (64×64 to 256×256) strike the best balance between computational efficiency and workload distribution. AVX provides only modest gains over SSE; the primary performance bottleneck lies in memory access speed rather than computational power. Overall, the proposed OpenCL-based implementation significantly accelerates H.266 encoding while maintaining high compression quality.
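
    The trade-off between block size and workload distribution mentioned in the abstract can be illustrated with a small sketch. This is not code from the thesis: the function names and the 1920×1080 frame size are assumptions chosen for illustration. It only counts how many independent blocks a frame yields at each block size, which is one rough proxy for how much parallel work is available to spread across compute units.

    ```python
    def blocks_per_frame(width: int, height: int, block: int) -> int:
        """Count the block x block tiles needed to cover a width x height frame.

        Partial tiles at the right and bottom edges count as whole tiles.
        """
        if width <= 0 or height <= 0 or block <= 0:
            raise ValueError("frame dimensions and block size must be positive")
        cols = -(-width // block)   # integer ceiling division
        rows = -(-height // block)
        return cols * rows


    def samples_per_block(block: int) -> int:
        """Number of samples each work unit processes for a square block."""
        return block * block


    if __name__ == "__main__":
        # Hypothetical 1080p frame; the abstract does not state the test resolution.
        width, height = 1920, 1080
        for block in (16, 64, 256, 512):
            print(block, blocks_per_frame(width, height, block), samples_per_block(block))
    ```

    Larger blocks give each work unit more samples to process but leave fewer independent units per frame, so the best block size depends on how many compute units the target device has to keep busy.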


Addis Ababa University © 2023