How libavif Handles Depth Maps and Auxiliary Images
This article explains how the libavif library decodes
and manages depth maps and auxiliary image items embedded in AVIF files.
It covers the underlying container structure, how the library identifies
auxiliary data during parsing, and the API mechanisms used to extract
these layers for rendering and image processing.
The Role of Auxiliary Images in AVIF
AVIF (AV1 Image File Format) is built on top of the HEIF (High Efficiency Image File Format) standard, which uses the ISO Base Media File Format (ISOBMFF) container. Within this container, a file can store more than just a single primary image. It can contain auxiliary image items, which are secondary images intended to provide complementary data to the primary image.
Common examples of auxiliary images include:
- Alpha Channels: Used for transparency and matte effects.
- Depth Maps: Used for 3D depth estimation, portrait mode post-processing, and spatial effects.
- Gain Maps: Used for HDR-to-SDR tone mapping.
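The box-based container layout described above can be explored with a few lines of code. The following is a minimal Python sketch of an ISOBMFF box walker, not libavif's own parser (libavif is written in C). The helper names iter_boxes and make_box, and the tiny synthetic sample buffer, are hypothetical and exist only for illustration.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        # Every box starts with a 32-bit big-endian size and a 4-character type.
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type.
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop walking
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size

def make_box(box_type, payload):
    """Build a box with a 32-bit size field (illustration helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic two-box buffer; a real AVIF file has far more content.
sample = make_box(b"ftyp", b"avif" + b"\x00" * 4 + b"avifmif1") + make_box(b"meta", b"\x00" * 4)
print([box_type for box_type, _, _ in iter_boxes(sample)])  # prints ['ftyp', 'meta']
```

Calling iter_boxes again on the payload range of a container box (skipping the 4-byte version and flags field of meta) walks the nested boxes in the same way.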
Container Structure and Identification
To handle these items, libavif parses the container’s
metadata boxes, specifically looking for relationships between image
items.
- Item References (iref): Each auxiliary image is linked to the primary image item by an auxl (auxiliary image) reference within the Item Reference Box. The reference points from the auxiliary item to the image it describes, which tells the parser that the secondary image is not a standalone picture but supplementary data for the main image.
- Auxiliary Type Property (auxC): Every auxiliary image item is associated with an auxC property. This property contains a Uniform Resource Identifier (URI) that defines the type of auxiliary data. For example, an alpha channel uses urn:mpeg:mpegB:cicp:systems:auxiliary:alpha or urn:mpeg:hevc:2015:auxid:1, while depth maps use URIs such as urn:mpeg:mpegB:cicp:systems:auxiliary:depth or urn:mpeg:hevc:2015:auxid:2, or custom URIs defined by specific camera manufacturers.
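To make the auxC lookup concrete, here is a small Python sketch that reads the type URI out of an auxC payload and sorts it into a category. It assumes the payload layout of a FullBox (one version byte, three flag bytes) followed by a null-terminated type string and optional subtype bytes. The function names and the two URN sets are illustrative assumptions: the sets list commonly used alpha and depth identifiers and are not exhaustive, so vendor-specific depth URIs would have to be added by hand.

```python
# Identifiers assumed for this sketch; extend as needed for vendor-specific URIs.
ALPHA_URNS = {
    "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha",
    "urn:mpeg:hevc:2015:auxid:1",
}
DEPTH_URNS = {
    "urn:mpeg:mpegB:cicp:systems:auxiliary:depth",
    "urn:mpeg:hevc:2015:auxid:2",
}

def parse_auxc(payload):
    """Return (aux_type, aux_subtype) from the payload of an auxC property."""
    if len(payload) < 4:
        raise ValueError("auxC payload too short")
    body = payload[4:]  # skip version (1 byte) and flags (3 bytes)
    aux_type, terminator, aux_subtype = body.partition(b"\x00")
    if not terminator:
        raise ValueError("aux_type is not null-terminated")
    return aux_type.decode("utf-8"), aux_subtype

def classify_aux_type(aux_type):
    """Map an auxiliary type URI to 'alpha', 'depth', or 'other'."""
    if aux_type in ALPHA_URNS:
        return "alpha"
    if aux_type in DEPTH_URNS:
        return "depth"
    return "other"

# Hand-built payload for illustration only.
example = b"\x00\x00\x00\x00" + b"urn:mpeg:hevc:2015:auxid:2" + b"\x00"
print(classify_aux_type(parse_auxc(example)[0]))  # prints depth
```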
How libavif Processes Alpha Channels
Because transparency is a core requirement for web graphics,
libavif treats the alpha channel auxiliary item as a
first-class citizen.
During the decoding process, if libavif detects an
auxiliary item of type alpha linked to the primary item, it
automatically decodes it alongside the primary YUV color channels. The
library then populates the alphaPlane and
alphaRowBytes members of the main avifImage
structure. This integrates the auxiliary alpha data directly into the
standard image buffer for easy access by the calling application.
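The row-bytes member matters because a plane's rows may be padded: row y starts y times the row stride into the buffer, not y times the width. The Python sketch below mimics that access pattern on a plain bytes object. The function plane_rows is hypothetical, and the sketch assumes that samples deeper than 8 bits occupy two bytes each in native byte order, which is how libavif lays out its planes in C.

```python
import struct

def plane_rows(plane, width, height, row_bytes, bit_depth=8):
    """Split a possibly padded plane buffer into rows of integer samples."""
    sample_size = 2 if bit_depth > 8 else 1
    if row_bytes < width * sample_size:
        raise ValueError("row_bytes is smaller than one row of samples")
    # '=' selects native byte order with standard sizes (B = 1 byte, H = 2 bytes).
    fmt = "={}{}".format(width, "H" if sample_size == 2 else "B")
    # Each row begins at y * row_bytes; trailing padding bytes are ignored.
    return [list(struct.unpack_from(fmt, plane, y * row_bytes)) for y in range(height)]

# A 3x2 8-bit alpha plane with one padding byte at the end of each row.
alpha = bytes([10, 20, 30, 0, 40, 50, 60, 0])
print(plane_rows(alpha, width=3, height=2, row_bytes=4))  # prints [[10, 20, 30], [40, 50, 60]]
```

In C, the equivalent walk indexes the plane pointer in steps of the row-bytes value, casting to 16-bit samples when the image's bit depth is above 8.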
How libavif Handles Depth Maps and Other Auxiliary Items
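Unlike alpha channels, depth maps and other custom auxiliary images
are not automatically merged into the avifImage structure returned for
the primary image. avifImage has no plane members for depth data (its
depth member is the bit depth of the samples, not a depth map).
Instead, these items are handled at the item level, starting from the
metadata gathered during parsing.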
1. Parsing the Meta Box
When avifDecoderParse() is called, the library reads the
file's meta box and indexes all items. It identifies the
primary item, which is signaled by the pitm box, and walks the
reference entries of the iref box to locate any associated
auxiliary items.
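The reference lookup in this step can be sketched as follows. This is an illustrative Python reading of the iref payload, not libavif's code: it assumes the standard layout in which version 0 uses 16-bit item IDs and later versions use 32-bit IDs, and each entry is a small box carrying a from-ID, a count, and a list of to-IDs. The names parse_iref and auxiliary_items_for are hypothetical, and truncated entries are not handled gracefully.

```python
import struct

def parse_iref(payload):
    """Return [(ref_type, from_id, [to_ids]), ...] from an iref box payload."""
    version = payload[0]
    id_fmt, id_size = (">H", 2) if version == 0 else (">I", 4)
    refs = []
    pos = 4  # skip version (1 byte) and flags (3 bytes)
    while pos + 8 <= len(payload):
        size, ref_type = struct.unpack_from(">I4s", payload, pos)
        if size < 8 or pos + size > len(payload):
            break  # malformed entry: stop parsing
        cur = pos + 8
        from_id = struct.unpack_from(id_fmt, payload, cur)[0]
        cur += id_size
        count = struct.unpack_from(">H", payload, cur)[0]
        cur += 2
        to_ids = []
        for _ in range(count):
            to_ids.append(struct.unpack_from(id_fmt, payload, cur)[0])
            cur += id_size
        refs.append((ref_type.decode("latin-1"), from_id, to_ids))
        pos += size
    return refs

def auxiliary_items_for(refs, primary_id):
    """IDs of items that point at primary_id through an 'auxl' reference."""
    return [from_id for ref_type, from_id, to_ids in refs
            if ref_type == "auxl" and primary_id in to_ids]

# Hand-built version-0 payload: items 2 and 3 are auxiliary to item 1,
# item 4 is a thumbnail of item 1.
iref_payload = (b"\x00\x00\x00\x00"
                + struct.pack(">I4sHHH", 14, b"auxl", 2, 1, 1)
                + struct.pack(">I4sHHH", 14, b"auxl", 3, 1, 1)
                + struct.pack(">I4sHHH", 14, b"thmb", 4, 1, 1))
print(auxiliary_items_for(parse_iref(iref_payload), primary_id=1))  # prints [2, 3]
```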
2. Extracting Auxiliary Properties
For each auxiliary item found, libavif reads the associated
auxC property to extract the auxiliary type URI. This URI is what
distinguishes a depth map from an alpha plane or any other auxiliary
type, so the presence of a depth map is established by checking for
depth-specific URIs.
3. Independent Decoding
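To access the depth map, the auxiliary item must be decoded on its own
rather than as part of the primary image. How an application selects
that item depends on what the libavif version in use exposes, so check
its avif.h header. The item decodes as a separate monochrome image
(typically YUV 4:0:0 at 8, 10, or 12 bits per sample, the bit depths
AV1 supports, with samples above 8 bits stored in 16-bit words), where
the Y channel represents the depth values. The decoded data occupies
its own avifImage structure, which keeps the depth map's native
resolution even when it differs from that of the primary image.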
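Because the depth map can have a different resolution from the primary image, a consumer usually has to map primary-image coordinates onto the depth grid. The following Python sketch does this with a nearest-neighbour lookup. It is an illustration under stated assumptions, not part of libavif: the function depth_at is hypothetical, the Y samples are assumed to be full-range and already unpacked into a list of rows, and the result is only a normalized sample value. Whether larger values mean nearer or farther, and whether the scale is linear, depends on the file's producer and is not determined by the samples alone.

```python
def depth_at(x, y, primary_size, depth_rows, bit_depth):
    """Normalized depth sample (0.0 to 1.0) for primary-image pixel (x, y)."""
    primary_width, primary_height = primary_size
    depth_height = len(depth_rows)
    depth_width = len(depth_rows[0])
    # Scale the coordinates to the depth grid and clamp to its last row/column.
    dx = min(depth_width - 1, x * depth_width // primary_width)
    dy = min(depth_height - 1, y * depth_height // primary_height)
    max_value = (1 << bit_depth) - 1  # 255 for 8-bit, 1023 for 10-bit, ...
    return depth_rows[dy][dx] / max_value

# A 2x2 8-bit depth map paired with a 4x4 primary image.
depth_rows = [[0, 255],
              [128, 64]]
print(depth_at(3, 0, (4, 4), depth_rows, bit_depth=8))  # prints 1.0
```

Nearest-neighbour sampling keeps depth edges sharp; bilinear interpolation would give smoother transitions at the cost of blending foreground and background values along object boundaries.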