How libavif Handles HEIF Image Structure
This article explains how the libavif library processes
the High Efficiency Image File Format (HEIF) structure to read and write
AVIF images. It covers how the library parses the underlying container,
manages metadata and color profiles, extracts compressed AV1 payloads,
and packages the final image files for distribution.
The Relationship Between AVIF and HEIF
AV1 Image File Format (AVIF) is a specific profile of the broader High Efficiency Image File Format (HEIF) standard. HEIF itself is based on the ISO Base Media File Format (ISOBMFF).
While HEIF defines how container boxes store image data, grid items,
and metadata, it does not define the compression codec.
libavif acts as the bridge between this HEIF container
structure and AV1 codec implementations. When decoding, it reads the
HEIF boxes to locate and describe the AV1 payloads; when encoding, it
wraps the encoded AV1 data in those same boxes.
Parsing the ISOBMFF Container
At its core, libavif contains a lightweight parser
designed specifically to navigate the nested “box” (or “atom”) structure
of ISOBMFF. When reading an AVIF file, the library parses several key
boxes:
- ftyp (File Type Box): libavif first reads this box to verify the major brand (such as avif, or avis for animated sequences) and ensure compatibility.
- meta (Metadata Box): This container holds information about the image structure, including item locations, properties, and relationships.
- iloc (Item Location Box): libavif queries this box to find the exact byte offsets and lengths of the compressed image payloads stored in the file.
- iinf (Item Information Box): This box identifies the types of items present (e.g., primary image, alpha auxiliary channel, or thumbnail).
By targeting only the boxes necessary for AVIF, libavif
maintains a small footprint and avoids the overhead of a full-scale HEIF
parser.
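The box walk described above can be illustrated with a minimal sketch. This is not libavif's parser (libavif is written in C); the function names iter_boxes and read_ftyp are hypothetical, and the code only assumes the standard ISOBMFF box header: a 32-bit big-endian size, a four-character type, and an optional 64-bit size when the 32-bit field is 1.

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                raise ValueError("truncated largesize at offset %d" % pos)
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type.decode("ascii", "replace"), pos + header, pos + size
        pos += size


def read_ftyp(data):
    """Return (major_brand, compatible_brands); the first box must be ftyp."""
    for box_type, lo, hi in iter_boxes(data):
        if box_type != "ftyp" or hi - lo < 8:
            raise ValueError("file does not start with a valid ftyp box")
        major = data[lo:lo + 4].decode("ascii", "replace")
        # Skip the 4-byte minor_version, then read 4-byte compatible brands.
        brands = [data[i:i + 4].decode("ascii", "replace")
                  for i in range(lo + 8, hi - 3, 4)]
        return major, brands
    raise ValueError("no boxes found")
```

A reader built this way would call read_ftyp on the start of the file, check that avif or avis appears among the brands, and then call iter_boxes again on the payload range of any container box it wants to descend into.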
Managing Image Properties and Color Profiles
Image properties are stored in the iprp (Item Properties
Box) inside the container. libavif parses these properties
to configure the canvas before decoding the pixel data:
- ispe (Image Spatial Extents): Defines the width and height of the image.
- pixi (Pixel Information): Defines the number of channels and the bit depth of each (8, 10, or 12 bits).
- colr (Color Information): Contains an ICC profile or NCLX (CICP) color parameters, which libavif extracts to ensure accurate color rendering.
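As a rough illustration of how compact these properties are, the sketch below decodes ispe and pixi payloads. It assumes the payload passed in starts right after the box header, so it begins with the 4-byte version-and-flags field that both boxes carry; the function names parse_ispe and parse_pixi are hypothetical and not part of libavif.

```python
import struct


def parse_ispe(payload):
    """Return (width, height) from an ispe payload."""
    if len(payload) < 12:
        raise ValueError("ispe payload too short")
    # 4 bytes of version/flags, then two 32-bit big-endian integers.
    _version_flags, width, height = struct.unpack_from(">III", payload, 0)
    return width, height


def parse_pixi(payload):
    """Return the list of per-channel bit depths from a pixi payload."""
    if len(payload) < 5:
        raise ValueError("pixi payload too short")
    # 4 bytes of version/flags, one byte of channel count, one byte per channel.
    num_channels = payload[4]
    depths = list(payload[5:5 + num_channels])
    if len(depths) != num_channels:
        raise ValueError("pixi payload truncated")
    return depths
```

For a 10-bit three-channel image, parse_pixi would return a list of three 10s, which a decoder could compare against the bit depth reported by the AV1 bitstream itself.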
If an image is stored as a grid item, that is, a derived image assembled
from smaller coded tile images (often used for very large resolutions),
libavif reads the iref (Item Reference Box) to find the tiles that
belong to the grid and stitches them back together into a single image.
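The stitching step comes down to simple arithmetic once the grid dimensions are known. The sketch below assumes the usual HEIF grid rules: tiles are listed in row-major order, all tiles share one size, and the rightmost column and bottom row are cropped to the output canvas. The function name tile_positions is hypothetical.

```python
def tile_positions(rows, columns, tile_width, tile_height,
                   output_width, output_height):
    """Return (x, y, visible_width, visible_height) per tile, row-major."""
    # The tiles must cover the canvas, and the last row/column must start inside it.
    if columns * tile_width < output_width or rows * tile_height < output_height:
        raise ValueError("tiles do not cover the output canvas")
    if (columns - 1) * tile_width >= output_width or \
            (rows - 1) * tile_height >= output_height:
        raise ValueError("grid has a row or column entirely outside the canvas")
    placements = []
    for row in range(rows):
        for col in range(columns):
            x = col * tile_width
            y = row * tile_height
            # Crop tiles that overhang the right or bottom edge.
            visible_w = min(tile_width, output_width - x)
            visible_h = min(tile_height, output_height - y)
            placements.append((x, y, visible_w, visible_h))
    return placements
```

Each decoded tile would then be copied into the output image at its (x, y) position, keeping only the visible width and height.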
Extracting and Decoding the AV1 Payload
Once libavif understands the container structure and
layout, it extracts the raw compressed AV1 bitstream from the
mdat (Media Data Box).
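Extraction itself is mostly a matter of slicing bytes. The following sketch assumes the item's extents have already been read from iloc and resolved to absolute file offsets; the function name read_item_payload and the (offset, length) tuple format are illustrative choices, not libavif's internal representation.

```python
def read_item_payload(file_bytes, extents):
    """Concatenate an item's (offset, length) extents into one payload."""
    chunks = []
    for offset, length in extents:
        # Reject extents that point outside the file rather than reading short.
        if offset < 0 or length < 0 or offset + length > len(file_bytes):
            raise ValueError("extent (%d, %d) is out of range" % (offset, length))
        chunks.append(file_bytes[offset:offset + length])
    return b"".join(chunks)
```

Most still images use a single extent per item, so the loop usually runs once; supporting several extents keeps the sketch valid for files that split a payload across the media data.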
Because libavif is a container library and not a codec,
it does not decode the raw AV1 bitstream itself. Instead, it passes the
extracted payload to an external AV1 decoder library (such as
dav1d or aom). The decoder outputs YUV planes, which libavif can
convert to RGB on request. Auxiliary data such as an alpha
transparency channel is stored as a separate auxiliary image item
within the HEIF structure; libavif decodes that item separately and
attaches the result to the image as its alpha plane.
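Conceptually, attaching alpha means pairing each color pixel with the matching sample from the separately decoded alpha plane. The sketch below shows that idea on plain Python lists; the function name attach_alpha and the pixel representation are assumptions made for illustration and do not reflect libavif's actual buffers.

```python
def attach_alpha(rgb_pixels, alpha_plane=None, max_value=255):
    """Combine (r, g, b) tuples with alpha samples into (r, g, b, a) tuples."""
    if alpha_plane is None:
        # No alpha item in the file: treat every pixel as fully opaque.
        return [(r, g, b, max_value) for r, g, b in rgb_pixels]
    if len(alpha_plane) != len(rgb_pixels):
        raise ValueError("alpha plane size does not match the color image")
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_pixels, alpha_plane)]
```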
Encoding and Writing HEIF Structures
When writing an AVIF file, libavif reverses this entire
process. It takes raw pixel input, compresses it using an external AV1
encoder (such as aom, rav1e, or
svt-av1), and then wraps the compressed bitstream into a
compliant HEIF container.
The library constructs the nested ISOBMFF boxes, generates the
correct properties (like color profiles and spatial extents), maps the
alpha channel if present, and writes the final bytes as a valid
HEIF-compatible .avif file.
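Box construction on the writing side can be sketched in a few lines. This example only assumes the standard ISOBMFF layout (32-bit size, four-character type, payload) and the ftyp payload layout (major brand, minor version, compatible brands); the helper names make_box and make_ftyp are hypothetical, and a real writer also has to fill in item offsets after the payload positions are known.

```python
import struct


def make_box(box_type, payload=b""):
    """Serialize one box: 32-bit size, 4-byte type, then the payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    size = 8 + len(payload)
    if size > 0xFFFFFFFF:
        raise ValueError("payload too large for a 32-bit box size")
    return struct.pack(">I4s", size, box_type) + payload


def make_ftyp(major_brand, minor_version, compatible_brands):
    """Build an ftyp box from 4-byte brand codes."""
    for brand in [major_brand] + list(compatible_brands):
        if len(brand) != 4:
            raise ValueError("brands must be exactly 4 bytes")
    payload = (major_brand + struct.pack(">I", minor_version)
               + b"".join(compatible_brands))
    return make_box(b"ftyp", payload)
```

Nesting falls out naturally: the payload of a container box is simply the concatenation of its already serialized child boxes, for example make_box(b"moov", make_box(b"free")).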