How libavif Handles HDR Image Data
This article explores how the libavif library processes
High Dynamic Range (HDR) image data. It explains the mechanisms
libavif uses to encode and decode HDR content, including
AV1 color signaling through Coding-Independent Code Points
(CICP), high bit-depth support, and static HDR metadata, so that
compatible displays can render the content faithfully.
The AV1 Foundation and CICP
Because an AVIF file stores AV1-coded intra frames in an
ISOBMFF-based (HEIF) container, libavif inherits the color
signaling of the AV1 video codec. To handle HDR, libavif relies on
Coding-Independent Code Points (CICP) as defined in ITU-T H.273
(published jointly as ISO/IEC 23091-2).
CICP specifies three critical parameters that libavif
embeds into the image metadata:
* Color Primaries: Defines the color gamut. For HDR, this is typically BT.2020, which covers a much wider gamut than the BT.709 primaries used by sRGB.
* Transfer Characteristics: Defines the transfer function that relates linear light to the encoded signal values. For HDR, libavif supports Perceptual Quantizer (PQ, SMPTE ST 2084) and Hybrid Log-Gamma (HLG, ARIB STD-B67).
* Matrix Coefficients: Defines how RGB data is converted into YUV luma and chroma channels for compression.
By writing these integer code points directly into the AVIF
container, libavif signals to the decoder exactly how to
interpret the luminance and color range of the pixel data.
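To make the three code points concrete, the following is a minimal sketch of a CICP triple and a check for HDR signaling. The numeric values are the code points assigned in ITU-T H.273; the type, enumerator, and function names are hypothetical and chosen for illustration only. The sketch does not call libavif.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Code point values from ITU-T H.273. All identifiers here are illustrative. */
enum {
    CP_BT709 = 1,       /* color primaries: BT.709 (shared with sRGB) */
    CP_BT2020 = 9,      /* color primaries: BT.2020 / BT.2100 */
    TC_BT709 = 1,       /* transfer: BT.709 */
    TC_SRGB = 13,       /* transfer: IEC 61966-2-1 (sRGB) */
    TC_PQ = 16,         /* transfer: SMPTE ST 2084 (PQ) */
    TC_HLG = 18,        /* transfer: ARIB STD-B67 (HLG) */
    MC_IDENTITY = 0,    /* matrix: planes carry GBR directly */
    MC_BT709 = 1,       /* matrix: BT.709 */
    MC_BT2020_NCL = 9   /* matrix: BT.2020 non-constant luminance */
};

typedef struct {
    uint16_t color_primaries;
    uint16_t transfer_characteristics;
    uint16_t matrix_coefficients;
    bool full_range;
} CicpTriple;

/* A triple signals HDR when its transfer function is PQ or HLG. */
static bool cicp_is_hdr(const CicpTriple *c) {
    assert(c != NULL);
    return c->transfer_characteristics == TC_PQ ||
           c->transfer_characteristics == TC_HLG;
}
```

A typical PQ image would be described as `{ CP_BT2020, TC_PQ, MC_BT2020_NCL, true }`, while an sRGB image would carry transfer characteristics 13 and therefore not be treated as HDR by this check.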
High Bit-Depth Support
Standard Dynamic Range (SDR) images typically use 8 bits per channel. HDR images require greater precision to prevent visual artifacts like color banding, especially in smooth gradients like skies.
The libavif library supports 8-bit, 10-bit, and 12-bit
color depths; HDR content uses 10 or 12 bits. When encoding an HDR
image, libavif retains as much of the source precision as the target
depth allows (for example, from a 16-bit integer PNG or a half-float
EXR) by converting it to a 10-bit or 12-bit YUV or RGB
representation before compressing it via the underlying AV1 encoder
(such as libaom, rav1e, or SVT-AV1).
HDR Metadata Preservation
Beyond basic color space configuration, HDR relies on metadata to
guide displays on how to tone-map content based on the screen’s physical
capabilities. libavif handles two primary types of HDR
metadata:
Content Light Level (CLL) and Mastering Display Color Volume (MDCV)
libavif can parse and write static HDR metadata structures:
* MDCV (SMPTE ST 2086): Describes the color primaries, white point, and luminance range of the monitor used to master the image.
* CLL: Describes the Maximum Content Light Level (MaxCLL) and Maximum Frame Average Light Level (MaxFALL) in nits.
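In an AVIF file, this metadata travels with the image as container-level item properties (the clli and mdcv boxes) rather than only inside the AV1 bitstream, allowing compatible web browsers and operating systems to tone-map the image to the display's brightness range instead of clipping the highlights.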
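The two content light level values can be derived from the pixel data itself. The sketch below follows the usual definition: MaxCLL is the brightest single-channel value of any pixel, and MaxFALL is the frame average of each pixel's brightest channel (a still image has only one frame). The struct and function names are hypothetical, and the input is assumed to be linear-light RGB already expressed in cd/m².

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { double r, g, b; } PixelNits;      /* linear light, cd/m^2 */
typedef struct { uint16_t max_cll, max_fall; } LightLevels;

/* Round to the nearest integer and clamp to the 16-bit field size. */
static uint16_t to_u16(double v) {
    if (v <= 0.0) return 0;
    if (v >= 65535.0) return 65535;
    return (uint16_t)(v + 0.5);
}

/* Computes MaxCLL and MaxFALL for a single frame of `count` pixels. */
static LightLevels compute_light_levels(const PixelNits *px, size_t count) {
    assert(px != NULL && count > 0);
    double peak = 0.0;
    double sum = 0.0;
    for (size_t i = 0; i < count; ++i) {
        double m = px[i].r;
        if (px[i].g > m) m = px[i].g;
        if (px[i].b > m) m = px[i].b;
        if (m > peak) peak = m;   /* brightest channel seen so far */
        sum += m;                 /* accumulates per-pixel maxima for the average */
    }
    LightLevels out;
    out.max_cll = to_u16(peak);
    out.max_fall = to_u16(sum / (double)count);
    return out;
}
```

A display that peaks below the reported MaxCLL knows it must compress the highlights, while a display that exceeds it can show the image without tone mapping.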
Legacy ICC Profiles
While CICP is the modern standard for HDR rendering on screens,
libavif also allows the inclusion of ICC profiles. If an
ICC profile is present alongside CICP values, libavif exposes both to
the application. Applications that support ICC profiles are expected to
check for one before relying on the CICP primaries and transfer
characteristics, while the CICP matrix coefficients still govern the
YUV-to-RGB conversion.
Encoding and Decoding Workflow
During the encoding process, the user provides libavif
with raw pixel data and specifies the HDR parameters (such as BT.2020
primaries and PQ transfer function). The library configures the AV1
encoder with the profile that matches the bit depth and chroma
subsampling (the AV1 Main profile covers 10-bit 4:2:0, High adds 4:4:4,
and Professional adds 4:2:2 and 12-bit) and writes the metadata into
the AVIF container.
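The profile choice follows mechanically from the bit depth and chroma format. The sketch below applies the profile rules from the AV1 specification, where seq_profile 0 is Main, 1 is High, and 2 is Professional. The enum and function names are hypothetical; in practice the underlying AV1 encoder makes this decision.

```c
#include <assert.h>

typedef enum { CHROMA_400, CHROMA_420, CHROMA_422, CHROMA_444 } ChromaFormat;

/* Returns the AV1 seq_profile needed for a given depth and chroma format:
 * 0 = Main (8/10-bit, 4:0:0 or 4:2:0)
 * 1 = High (8/10-bit, 4:4:4)
 * 2 = Professional (any 4:2:2, or any 12-bit content) */
static int av1_seq_profile(int depth, ChromaFormat chroma) {
    assert(depth == 8 || depth == 10 || depth == 12);
    if (depth == 12 || chroma == CHROMA_422) return 2;
    if (chroma == CHROMA_444) return 1;
    return 0;
}
```

A 10-bit 4:2:0 PQ image therefore fits in the Main profile, while the same image stored as 10-bit 4:4:4 needs High and a 12-bit version needs Professional.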
During decoding, libavif extracts the compressed AV1
bitstream and parses the container’s metadata. It outputs the decoded
planes at their native 10-bit or 12-bit depth (stored in 16-bit
samples), alongside the parsed CICP and luminance
metadata. The rendering application or operating system’s graphics
pipeline then uses this information to map the colors accurately to the
user’s HDR-capable display.