How libavif Handles Depth Maps and Auxiliary Images

This article explains how the libavif library decodes and manages depth maps and auxiliary image items embedded in AVIF files. It covers the underlying container structure, how the library identifies auxiliary data during parsing, and the API mechanisms used to extract these layers for rendering and image processing.

The Role of Auxiliary Images in AVIF

AVIF (AV1 Image File Format) is built on top of the HEIF (High Efficiency Image File Format) standard, which uses the ISO Base Media File Format (ISOBMFF) container. Within this container, a file can store more than just a single primary image. It can contain auxiliary image items, which are secondary images intended to provide complementary data to the primary image.

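Common examples of auxiliary images include:

* Alpha Channels: Used for transparency and matte effects.
* Depth Maps: Used for 3D depth estimation, portrait mode post-processing, and spatial effects.
* Gain Maps: Used to convert between SDR and HDR renditions of the same image. Recent AVIF revisions signal these through a tone-map derived item rather than an auxiliary type URI, but they play a similar complementary role.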

Container Structure and Identification

To handle these items, libavif parses the container’s metadata boxes, specifically looking for relationships between image items.

Item References (iref): Each auxiliary image item is linked to its master image by a reference of type auxl inside the Item Reference Box. The reference runs from the auxiliary item to the image it augments, which tells the parser that the secondary image is not a standalone picture but supplementary data for the main image.
Auxiliary Type Property (auxC): Every auxiliary image item is associated with an auxC property. This property carries a null-terminated URN string that identifies the kind of auxiliary data. Alpha is identified by urn:mpeg:mpegB:cicp:systems:auxiliary:alpha (the MIAF form used by AVIF) or urn:mpeg:hevc:2015:auxid:1 (the HEVC-derived form). Depth maps use urn:mpeg:mpegB:cicp:systems:auxiliary:depth or urn:mpeg:hevc:2015:auxid:2, and some camera manufacturers define their own URIs.
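As a rough illustration of how an auxC type string can be mapped to a kind of auxiliary data, here is a minimal sketch in plain C. It does not use libavif; the AuxKind enum and classify_aux_type() are hypothetical names invented for this example. The URN table reflects our reading of the HEIF and MIAF specifications (auxid:1 and the "alpha" URN for alpha, auxid:2 and the "depth" URN for depth) and should be checked against those documents before being relied on.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only in this sketch (not a libavif type). */
typedef enum { AUX_KIND_UNKNOWN = 0, AUX_KIND_ALPHA, AUX_KIND_DEPTH } AuxKind;

/* Map an auxC type string to a kind using exact, case-sensitive matching.
 * The URN lists are assumptions taken from the HEIF/MIAF specifications. */
AuxKind classify_aux_type(const char *aux_type)
{
    static const char *const alpha_urns[] = {
        "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha",
        "urn:mpeg:hevc:2015:auxid:1",
    };
    static const char *const depth_urns[] = {
        "urn:mpeg:mpegB:cicp:systems:auxiliary:depth",
        "urn:mpeg:hevc:2015:auxid:2",
    };

    if (aux_type == NULL) {
        return AUX_KIND_UNKNOWN;
    }
    for (size_t i = 0; i < sizeof alpha_urns / sizeof alpha_urns[0]; ++i) {
        if (strcmp(aux_type, alpha_urns[i]) == 0) {
            return AUX_KIND_ALPHA;
        }
    }
    for (size_t i = 0; i < sizeof depth_urns / sizeof depth_urns[0]; ++i) {
        if (strcmp(aux_type, depth_urns[i]) == 0) {
            return AUX_KIND_DEPTH;
        }
    }
    /* Vendor-specific URIs land here; a real application would extend the tables. */
    return AUX_KIND_UNKNOWN;
}
```

Exact matching is deliberately conservative: a vendor URI that merely contains the word "depth" is reported as unknown rather than guessed at.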

How libavif Processes Alpha Channels

Because transparency is a core requirement for web graphics, libavif treats the alpha channel auxiliary item as a first-class citizen.

During the decoding process, if libavif detects an auxiliary item of type alpha linked to the primary item, it automatically decodes it alongside the primary YUV color channels. The library then populates the alphaPlane and alphaRowBytes members of the main avifImage structure. This integrates the auxiliary alpha data directly into the standard image buffer for easy access by the calling application.
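To make the role of alphaPlane and alphaRowBytes concrete, the sketch below reads one alpha sample from a buffer laid out the way libavif lays out its planes: rows are alphaRowBytes apart, 8-bit samples occupy one byte, and deeper samples are stored as 16-bit values in native byte order. The AlphaView struct and alpha_at() are hypothetical stand-ins written for this example so that it compiles without libavif; in real code the same fields would come from the decoded avifImage.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view mirroring the avifImage fields discussed above. */
typedef struct {
    const uint8_t *alphaPlane; /* NULL when the image carries no alpha */
    uint32_t alphaRowBytes;    /* distance in bytes between consecutive rows */
    uint32_t width;
    uint32_t height;
    uint32_t depth;            /* bits per sample, e.g. 8, 10 or 12 */
} AlphaView;

/* Return the alpha at (x, y) normalized to [0, 1].
 * Falls back to fully opaque (1.0) when there is no alpha plane or the
 * request is out of range; that fallback is a choice made for this sketch. */
float alpha_at(const AlphaView *v, uint32_t x, uint32_t y)
{
    if (v == NULL || v->alphaPlane == NULL || x >= v->width || y >= v->height ||
        v->depth == 0 || v->depth > 16) {
        return 1.0f;
    }

    const uint8_t *row = v->alphaPlane + (size_t)y * v->alphaRowBytes;
    uint32_t sample;
    if (v->depth > 8) {
        /* Samples deeper than 8 bits are held in 16-bit words. */
        uint16_t s16;
        memcpy(&s16, row + (size_t)x * 2, sizeof s16);
        sample = s16;
    } else {
        sample = row[x];
    }

    uint32_t max_value = (1u << v->depth) - 1u;
    return (float)sample / (float)max_value;
}
```

Indexing through the row stride rather than the width matters because a row may be padded, so alphaRowBytes can be larger than the number of bytes the visible samples need.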

How libavif Handles Depth Maps and Other Auxiliary Items

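Unlike alpha channels, depth maps and other custom auxiliary images are not merged into the avifImage structure: there is no depth counterpart to alphaPlane. libavif encounters these items while parsing the container, at the item level. How much of that item-level information is reachable from application code depends on the library version, so check the avif.h header you build against before relying on it.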

1. Parsing the Meta Box

When avifDecoderParse() is called, the library reads the file’s meta box and indexes all items. It identifies the primary item and traverses the iref boxes to locate any associated auxiliary items.
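The traversal of item references can be pictured with a small standalone parser. The sketch below is not libavif code; it walks the child boxes of a version-0 iref box (16-bit item IDs) and collects every auxl link it finds. AuxLink and collect_auxl_links() are hypothetical names, and the input is assumed to be the iref payload after its 4-byte version and flags field. Boxes using the 64-bit or "extends to end of file" size encodings are rejected to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One auxl relationship: the auxiliary item and the image it belongs to. */
typedef struct {
    uint16_t auxiliary_item_id; /* from_item_ID of the reference */
    uint16_t master_item_id;    /* to_item_ID of the reference */
} AuxLink;

static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walk the reference boxes in data[0..size) and store auxl links in out.
 * Returns the number of links stored (stopping early once out is full),
 * or -1 if the data is malformed or uses an unsupported size encoding. */
int collect_auxl_links(const uint8_t *data, size_t size, AuxLink *out, size_t out_cap)
{
    if (data == NULL || out == NULL) {
        return -1;
    }

    size_t pos = 0;
    size_t count = 0;
    while (pos < size) {
        if (size - pos < 8) {
            return -1; /* not enough bytes for a box header */
        }
        uint32_t box_size = read_be32(data + pos);
        /* Need header (8) + from_item_ID (2) + reference_count (2). */
        if (box_size < 12 || box_size > size - pos) {
            return -1;
        }
        const uint8_t *type = data + pos + 4;
        const uint8_t *body = data + pos + 8;
        uint16_t from_id = read_be16(body);
        uint16_t ref_count = read_be16(body + 2);
        if ((size_t)ref_count * 2 > (size_t)box_size - 12) {
            return -1; /* reference list runs past the end of the box */
        }
        if (memcmp(type, "auxl", 4) == 0) {
            for (size_t i = 0; i < ref_count; ++i) {
                if (count == out_cap) {
                    return (int)count;
                }
                out[count].auxiliary_item_id = from_id;
                out[count].master_item_id = read_be16(body + 4 + i * 2);
                ++count;
            }
        }
        pos += box_size; /* other reference types (thmb, cdsc, ...) are skipped */
    }
    return (int)count;
}
```

An application would then keep only the links whose master_item_id equals the primary item ID and look up the auxC property of each auxiliary_item_id to learn what it contains.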

2. Extracting Auxiliary Properties

For each auxiliary item found, libavif reads the auxC box to extract the auxiliary type URI. This allows the application using libavif to query the file and determine if a depth map is present by checking for depth-specific URIs.
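Extracting the type string from an auxC property is mostly a matter of bounds checking. The following sketch assumes the layout we understand from the HEIF specification: a 4-byte version and flags field, then a null-terminated type string, then optional subtype bytes. The function auxc_read_type() is a hypothetical helper written for this example, not a libavif call, and its input is the property payload after the box header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the auxiliary type string out of an auxC payload.
 * payload/size: bytes following the box header (version and flags first).
 * out/out_cap:  destination buffer for the null-terminated string.
 * Returns 0 on success, -1 if the payload is malformed or out is too small. */
int auxc_read_type(const uint8_t *payload, size_t size, char *out, size_t out_cap)
{
    if (payload == NULL || out == NULL || size < 5) {
        return -1; /* need 4 header bytes plus at least a terminator */
    }

    const uint8_t *str = payload + 4; /* skip version (1 byte) and flags (3 bytes) */
    size_t max_len = size - 4;

    /* The terminator must lie inside the payload; never read past it. */
    const uint8_t *nul = (const uint8_t *)memchr(str, 0, max_len);
    if (nul == NULL) {
        return -1;
    }

    size_t len = (size_t)(nul - str);
    if (len + 1 > out_cap) {
        return -1;
    }
    memcpy(out, str, len + 1); /* includes the terminator */
    return 0;
}
```

The returned string can then be compared against the alpha and depth URNs the application recognizes. Any bytes after the terminator are subtype data and are ignored here.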

3. Independent Decoding

To access the depth map, the auxiliary item must be decoded on its own, addressed by its item ID, rather than as part of the primary image. It decodes as a separate monochrome image: a single Y plane (YUV 4:0:0) whose samples represent the depth values. AV1 codes samples at 8, 10, or 12 bits, and samples deeper than 8 bits are held in 16-bit words in memory. The decoded data is returned in its own avifImage structure, so the depth map keeps its native resolution, which may be lower than that of the primary image.
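Because a depth map often has a lower resolution than the color image, a consumer usually needs to map color pixel coordinates onto the depth plane. The sketch below does this with nearest-neighbor scaling. DepthView and depth_sample_for_pixel() are hypothetical names that mirror the Y plane fields of a decoded image, and the code assumes the depth map covers the same field of view as the color image. It returns the raw sample only; converting that value to a physical distance depends on metadata this sketch does not interpret.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a decoded monochrome depth image (Y plane only). */
typedef struct {
    const uint8_t *yPlane;
    uint32_t yRowBytes; /* distance in bytes between consecutive rows */
    uint32_t width;
    uint32_t height;
    uint32_t depth;     /* bits per sample, e.g. 8, 10 or 12 */
} DepthView;

/* Return the raw depth sample that corresponds to color pixel (cx, cy) in a
 * color image of color_w x color_h pixels, using nearest-neighbor scaling.
 * Returns -1 if any argument is invalid. */
int32_t depth_sample_for_pixel(const DepthView *d, uint32_t color_w, uint32_t color_h,
                               uint32_t cx, uint32_t cy)
{
    if (d == NULL || d->yPlane == NULL || d->width == 0 || d->height == 0 ||
        d->depth == 0 || d->depth > 16 || color_w == 0 || color_h == 0 ||
        cx >= color_w || cy >= color_h) {
        return -1;
    }

    /* Scale coordinates; 64-bit math avoids overflow for large images.
     * Since cx < color_w, the result is always < d->width (same for y). */
    uint32_t dx = (uint32_t)(((uint64_t)cx * d->width) / color_w);
    uint32_t dy = (uint32_t)(((uint64_t)cy * d->height) / color_h);

    const uint8_t *row = d->yPlane + (size_t)dy * d->yRowBytes;
    if (d->depth > 8) {
        uint16_t s16; /* deeper samples are stored as 16-bit words */
        memcpy(&s16, row + (size_t)dx * 2, sizeof s16);
        return (int32_t)s16;
    }
    return (int32_t)row[dx];
}
```

Nearest-neighbor lookup keeps depth edges crisp, which tends to suit masking and portrait effects; an application that needs smooth gradients might prefer bilinear interpolation instead.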