How libavif Handles HEIF Image Structure

This article explains how the libavif library processes the High Efficiency Image File Format (HEIF) structure to read and write AVIF images. It covers how the library parses the underlying container, manages metadata and color profiles, extracts compressed AV1 payloads, and packages the final image files for distribution.

The Relationship Between AVIF and HEIF

AV1 Image File Format (AVIF) is a specific profile of the broader High Efficiency Image File Format (HEIF) standard. HEIF itself is based on the ISO Base Media File Format (ISOBMFF).

While HEIF defines how container boxes store image data, grid items, and metadata, it does not define the compression codec. libavif acts as the bridge between this HEIF container structure and the AV1 video codec, translating HEIF boxes into raw image data and vice versa.

Parsing the ISOBMFF Container

At its core, libavif contains a lightweight parser designed specifically to navigate the nested “box” (or “atom”) structure of ISOBMFF. When reading an AVIF file, the library parses several key boxes:

ftyp (File Type Box): libavif reads this box first and checks the major brand and the list of compatible brands for avif (still images) or avis (animated sequences) before going any further.
meta (Metadata Box): This container holds information about the image structure, including item locations, properties, and relationships.
iloc (Item Location Box): libavif queries this box to find the exact byte offsets and lengths of the compressed image payloads stored in the file.
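iinf (Item Information Box): This box lists every item in the file together with its type (e.g., av01 for a coded AV1 image, grid for a derived grid image, or Exif for metadata). An item's role, such as primary image, alpha auxiliary channel, or thumbnail, is determined by the pitm (Primary Item Box), the item references, and the item properties.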

By targeting only the boxes necessary for AVIF, libavif maintains a small footprint and avoids the overhead of a full-scale HEIF parser.
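
The box walk described above can be sketched in a few lines. This is a minimal illustration in Python rather than libavif's actual C parser; the function names iter_boxes and parse_ftyp are hypothetical. It assumes the standard ISOBMFF header layout: a 32-bit big-endian size, a four-character type, an optional 64-bit size when the 32-bit size is 1, and "extends to the end" when the size is 0.

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                raise ValueError("truncated largesize header")
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad size for box {box_type!r}")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def parse_ftyp(payload):
    """Return (major_brand, minor_version, compatible_brands) from an ftyp payload."""
    major, minor = struct.unpack_from(">4sI", payload, 0)
    brands = [
        payload[i:i + 4].decode("latin-1")
        for i in range(8, len(payload) - 3, 4)
    ]
    return major.decode("latin-1"), minor, brands
```

To descend into a container, call iter_boxes again on the payload range of the parent. Note that meta is a "full box": its payload begins with a 4-byte version-and-flags field, so its children start 4 bytes after payload_start.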

Managing Image Properties and Color Profiles

Image properties are stored in the iprp (Item Properties Box) inside the container. libavif parses these properties to configure the canvas before decoding the pixel data:

ispe (Image Spatial Extents): Defines the width and height of the image.
pixi (Pixel Information): Defines the bit depth (8, 10, or 12 bits) and channel configurations.
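colr (Color Information): Contains either an ICC profile or NCLX color information (CICP code points for color primaries, transfer characteristics, and matrix coefficients, plus a full-range flag), which libavif extracts to ensure accurate color rendering.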
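
As an illustration of how compact these property payloads are, the sketch below decodes the three boxes listed above from their raw payload bytes (the bytes after the 8-byte box header). It is written in Python for brevity and is not libavif's code; the function names and the returned dictionaries are hypothetical. It assumes ispe and pixi are full boxes (a 4-byte version-and-flags field comes first) and colr is a plain box.

```python
import struct


def parse_ispe(payload):
    """Image Spatial Extents: version/flags (4 bytes), then width and height."""
    width, height = struct.unpack_from(">II", payload, 4)
    return {"width": width, "height": height}


def parse_pixi(payload):
    """Pixel Information: version/flags, channel count, one depth byte per channel."""
    num_channels = payload[4]
    depths = list(payload[5:5 + num_channels])
    if len(depths) != num_channels:
        raise ValueError("truncated pixi payload")
    return {"bits_per_channel": depths}


def parse_colr(payload):
    """Color Information: a four-character color type, then type-specific data."""
    colour_type = payload[:4].decode("latin-1")
    if colour_type == "nclx":
        primaries, transfer, matrix, flags = struct.unpack_from(">HHHB", payload, 4)
        return {
            "type": "nclx",
            "colour_primaries": primaries,
            "transfer_characteristics": transfer,
            "matrix_coefficients": matrix,
            "full_range": bool(flags & 0x80),  # top bit; low 7 bits are reserved
        }
    if colour_type in ("rICC", "prof"):
        # The rest of the payload is the ICC profile itself.
        return {"type": colour_type, "icc": payload[4:]}
    return {"type": colour_type}
```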

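If an image is stored as a grid of smaller tile images (often used for very large resolutions), the primary item is a derived grid item rather than a coded image. libavif reads the iref (Item Reference Box) to find the tile items the grid refers to, then uses the grid's row and column counts and output dimensions to place each decoded tile on the final canvas.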
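
The grid layout arithmetic can be sketched as follows. This is an illustrative Python sketch, not libavif's implementation; parse_grid and tile_rects are hypothetical names. It assumes the HEIF ImageGrid payload layout (version, flags, rows minus one, columns minus one, then output width and height as 16-bit fields, or 32-bit when the low flag bit is set) and that tiles are laid out in row-major order, with the right and bottom edges cropped to the output size.

```python
import struct


def parse_grid(payload):
    """Return (rows, columns, output_width, output_height) from a grid item payload."""
    version, flags, rows_minus_one, cols_minus_one = struct.unpack_from(">BBBB", payload, 0)
    if version != 0:
        raise ValueError("unsupported grid version")
    # Flag bit 0 selects 32-bit output dimensions instead of 16-bit ones.
    fmt = ">II" if flags & 1 else ">HH"
    out_w, out_h = struct.unpack_from(fmt, payload, 4)
    return rows_minus_one + 1, cols_minus_one + 1, out_w, out_h


def tile_rects(rows, cols, tile_w, tile_h, out_w, out_h):
    """Return (x, y, width, height) per tile, row-major, clipped to the canvas."""
    rects = []
    for r in range(rows):
        for c in range(cols):
            x, y = c * tile_w, r * tile_h
            # Tiles on the right and bottom edges may extend past the canvas.
            w = min(tile_w, out_w - x)
            h = min(tile_h, out_h - y)
            rects.append((x, y, w, h))
    return rects
```

For example, a 2x2 grid of 512x512 tiles with a 1000x600 output keeps the full top-left tile and crops the other three.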

Extracting and Decoding the AV1 Payload

Once libavif understands the container structure and layout, it uses the offsets and lengths recorded in iloc to extract the compressed AV1 bitstream, which normally lives in the mdat (Media Data Box).
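
The extraction step amounts to slicing the file at the recorded locations. Below is a minimal Python sketch under simplifying assumptions: the item's extents have already been read from iloc as (offset, length) pairs, and the offsets are relative to the start of the file plus an optional base offset. The function name read_item_payload is hypothetical and is not part of libavif.

```python
def read_item_payload(file_bytes, extents, base_offset=0):
    """Concatenate an item's extents into one contiguous compressed payload."""
    chunks = []
    for offset, length in extents:
        start = base_offset + offset
        end = start + length
        # Reject extents that point outside the file instead of reading short.
        if start < 0 or length < 0 or end > len(file_bytes):
            raise ValueError("extent lies outside the file")
        chunks.append(file_bytes[start:end])
    return b"".join(chunks)
```

Validating every extent against the file size matters in practice, because offsets come from untrusted input.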

Because libavif is a container library and not a codec, it does not decode the AV1 bitstream itself. Instead, it passes the extracted payload to an external AV1 decoder library (such as dav1d or libaom), which returns planar YUV pixels; conversion to RGB is a separate, optional step. Alpha transparency is stored as a separate auxiliary image item within the HEIF structure, so libavif extracts and decodes that item in the same way and attaches the result to the color image as its alpha plane.
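
Locating the alpha item is a lookup over item properties and references. The sketch below shows the idea in Python with hypothetical inputs: aux_types maps an item ID to the auxiliary type URN from its auxC property, and auxl_refs maps an auxiliary item ID to the item IDs it points to through auxl references. It assumes the alpha URN urn:mpeg:mpegB:cicp:systems:auxiliary:alpha and is not libavif's own code.

```python
ALPHA_URN = "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha"


def find_alpha_item(primary_id, aux_types, auxl_refs):
    """Return the ID of the alpha auxiliary item for primary_id, or None.

    aux_types: dict of item_id -> auxiliary type URN (hypothetical structure).
    auxl_refs: dict of auxiliary item_id -> list of referenced item IDs.
    """
    for item_id, urn in aux_types.items():
        # The auxiliary item must be typed as alpha AND point at the primary item.
        if urn == ALPHA_URN and primary_id in auxl_refs.get(item_id, ()):
            return item_id
    return None
```

Checking both the type and the reference avoids picking up another auxiliary image, or an alpha plane that belongs to a different item such as a thumbnail.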

Encoding and Writing HEIF Structures

When writing an AVIF file, libavif reverses this process. It takes raw pixel input, compresses it using an external AV1 encoder (such as libaom, rav1e, or SVT-AV1), and then wraps the compressed bitstream in a compliant HEIF container.

The library constructs the nested ISOBMFF boxes, generates the correct properties (like color profiles and spatial extents), maps the alpha channel if present, and writes the final bytes as a valid HEIF-compatible .avif file.
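
Box construction is the mirror image of parsing: serialize the payload, then prefix it with its size and type. The following Python sketch illustrates the idea with hypothetical helper names; it is not libavif's writer, and the brand list in the usage note is an illustrative assumption rather than the exact set libavif emits.

```python
import struct


def make_box(box_type, payload=b""):
    """Serialize a plain box: 32-bit size, four-character type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def make_full_box(box_type, version, flags, payload=b""):
    """Serialize a full box: adds 1 byte of version and 3 bytes of flags."""
    header = struct.pack(">I", (version << 24) | flags)
    return make_box(box_type, header + payload)


def make_ftyp(major_brand, compatible_brands, minor_version=0):
    """Build an ftyp box from four-byte brand codes."""
    body = major_brand + struct.pack(">I", minor_version) + b"".join(compatible_brands)
    return make_box(b"ftyp", body)


def make_ispe(width, height):
    """Build an ispe property carrying the image dimensions."""
    return make_full_box(b"ispe", 0, 0, struct.pack(">II", width, height))
```

For instance, make_ftyp(b"avif", [b"avif", b"mif1", b"miaf"]) yields a 28-byte box. Container boxes are built the same way by passing the concatenated child boxes as the payload.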