Handling Multiple Image Items in libavif
This article explains how the libavif library processes AVIF files containing multiple independent image items. It details how the decoder identifies the primary image, how it distinguishes between image sequences and static collections, and what happens to secondary or alternate images during the decoding process.
In the AVIF specification—which is built upon the High Efficiency Image File Format (HEIF) and the ISO Base Media File Format (ISOBMFF)—a single file can package multiple independent image items. These items can represent alternate resolutions, different crops, or entirely distinct images grouped together.
Primary Item Identification
When libavif parses an AVIF file containing multiple independent items, its default behavior is guided by the pitm (Primary Item) box in the file’s metadata.
- The Default Action: When avifDecoderParse() is called, libavif scans the container and locates the designated primary item ID.
- Decoding: The standard decoding pipeline targets this primary image. A subsequent call to avifDecoderNextImage() decodes the primary item along with its associated auxiliary items, such as an alpha channel (transparency).
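To make the role of the pitm box concrete, the following C sketch reads the primary item ID straight from the raw container bytes. It is not libavif code: primary_item_id, find_box, and read_be32 are hypothetical helpers written for illustration, and the sketch assumes 32-bit box sizes, a top-level meta box, and a pitm box of version 0 or 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit integer. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Find the first box of the given type among the sibling boxes in
 * [buf, buf + len). Returns 1 and reports the payload on success.
 * Assumption: 64-bit box sizes (size == 1) are treated as unsupported. */
static int find_box(const uint8_t *buf, size_t len, const char *type,
                    const uint8_t **payload, size_t *payload_len) {
    size_t off = 0;
    while (len - off >= 8) {
        uint32_t size = read_be32(buf + off);
        /* size == 0 means "box extends to the end of the data". */
        size_t box_len = (size == 0) ? len - off : size;
        if (box_len < 8 || box_len > len - off) return 0; /* malformed */
        if (memcmp(buf + off + 4, type, 4) == 0) {
            *payload = buf + off + 8;
            *payload_len = box_len - 8;
            return 1;
        }
        off += box_len;
    }
    return 0;
}

/* Returns 1 and stores the primary item ID, or 0 if it cannot be found. */
int primary_item_id(const uint8_t *file, size_t len, uint32_t *item_id) {
    const uint8_t *meta, *pitm;
    size_t meta_len, pitm_len;
    assert(file != NULL || len == 0);
    if (!find_box(file, len, "meta", &meta, &meta_len) || meta_len < 4) return 0;
    /* meta is a FullBox: skip 4 bytes of version/flags before its children. */
    if (!find_box(meta + 4, meta_len - 4, "pitm", &pitm, &pitm_len)) return 0;
    if (pitm_len >= 6 && pitm[0] == 0) {        /* version 0: 16-bit item_ID */
        *item_id = ((uint32_t)pitm[4] << 8) | pitm[5];
        return 1;
    }
    if (pitm_len >= 8 && pitm[0] == 1) {        /* version 1: 32-bit item_ID */
        *item_id = read_be32(pitm + 4);
        return 1;
    }
    return 0;
}
```

Passing a buffer that holds the whole file to primary_item_id() yields the ID of the item that the decoder treats as its default target. An application using libavif does not need this step, since the lookup described above happens inside avifDecoderParse().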
Image Sequences vs. Image Collections
libavif distinguishes between multiple images based on how they are structured within the container:
- Image Sequences (Animations): If multiple images are stored as a sequence (using a track, or trak box), libavif treats them as frames of an animation. The developer can iterate through the frames sequentially using avifDecoderNextImage(), retrieving the duration and image data for each frame.
- Image Collections (Static Alternates): If the images are stored as independent, static items (an image collection) rather than a timed sequence, libavif does not automatically loop or cycle through them. By default, it only exposes and decodes the primary item.
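The structural difference between the two cases can be checked without decoding any pixels. The sketch below is a rough heuristic and not libavif's actual selection logic: it only reports whether a top-level moov box (the container that holds trak boxes) is present. The names classify_layout and layout_t are hypothetical, and only 32-bit box sizes are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { LAYOUT_UNKNOWN, LAYOUT_ITEMS_ONLY, LAYOUT_HAS_TRACKS } layout_t;

/* Read a big-endian 32-bit integer. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 1 if a top-level box of the given type exists in the buffer. */
static int has_top_level_box(const uint8_t *buf, size_t len, const char *type) {
    size_t off = 0;
    while (len - off >= 8) {
        uint32_t size = read_be32(buf + off);
        /* size == 0 means "box extends to the end of the data". */
        size_t box_len = (size == 0) ? len - off : size;
        if (box_len < 8 || box_len > len - off) return 0; /* malformed */
        if (memcmp(buf + off + 4, type, 4) == 0) return 1;
        off += box_len;
    }
    return 0;
}

/* Classify a file by container structure alone:
 * a moov box implies timed tracks, a lone meta box implies static items. */
layout_t classify_layout(const uint8_t *file, size_t len) {
    assert(file != NULL || len == 0);
    if (has_top_level_box(file, len, "moov")) return LAYOUT_HAS_TRACKS;
    if (has_top_level_box(file, len, "meta")) return LAYOUT_ITEMS_ONLY;
    return LAYOUT_UNKNOWN;
}
```

A file can carry both a meta box with a primary item and a moov box with tracks, which is why the track check comes first here. Within libavif, the choice between the two sources can be forced with avifDecoderSetSource(), using AVIF_DECODER_SOURCE_PRIMARY_ITEM or AVIF_DECODER_SOURCE_TRACKS.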
Accessing Non-Primary Independent Items
If an AVIF file contains multiple independent static images and you need to access an item other than the primary one, the high-level libavif decoder API has specific design limitations:
- Targeted Use Case: libavif is primarily optimized for the “primary image + auxiliary channels” or “animated image sequence” use cases.
- API Exposure: The high-level API does not provide a direct, simple array interface to iterate and decode unrelated, independent static images.
- Alternative Approaches: To extract independent, non-primary static images (such as a secondary image or an unlinked thumbnail), developers must either use a broader container-parsing library (such as libheif) or use lower-level container-parsing functions to identify the item IDs and manually direct the decoder to those specific item IDs.
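As an illustration of the item-identification step, the following C sketch enumerates the item IDs and item types declared in a file's iinf (Item Information) box. It is a standalone parser and not part of libavif: list_items, item_info, next_box, and find_box are hypothetical names, and the sketch assumes 32-bit box sizes and infe entries of version 2 or 3. It only identifies items; decoding a chosen item still requires a library that supports selecting it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t id; char type[5]; } item_info;

static uint32_t be16(const uint8_t *p) { return ((uint32_t)p[0] << 8) | p[1]; }
static uint32_t be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read the box starting at *off and advance *off past it.
 * Returns 0 at the end of the data or on a malformed box. */
static int next_box(const uint8_t *buf, size_t len, size_t *off,
                    const uint8_t **type, const uint8_t **payload, size_t *plen) {
    if (*off > len || len - *off < 8) return 0;
    uint32_t size = be32(buf + *off);
    size_t box_len = (size == 0) ? len - *off : size;
    if (box_len < 8 || box_len > len - *off) return 0;
    *type = buf + *off + 4;
    *payload = buf + *off + 8;
    *plen = box_len - 8;
    *off += box_len;
    return 1;
}

/* Find the first sibling box of the wanted type. */
static int find_box(const uint8_t *buf, size_t len, const char *want,
                    const uint8_t **payload, size_t *plen) {
    size_t off = 0;
    const uint8_t *type;
    while (next_box(buf, len, &off, &type, payload, plen))
        if (memcmp(type, want, 4) == 0) return 1;
    return 0;
}

/* Store up to max items in out; returns the count, or -1 if the
 * meta or iinf box is missing. */
int list_items(const uint8_t *file, size_t len, item_info *out, int max) {
    const uint8_t *meta, *iinf, *type, *p;
    size_t meta_len, iinf_len, plen, off;
    assert(out != NULL && max >= 0);
    if (!find_box(file, len, "meta", &meta, &meta_len) || meta_len < 4) return -1;
    /* meta is a FullBox: skip 4 bytes of version/flags before its children. */
    if (!find_box(meta + 4, meta_len - 4, "iinf", &iinf, &iinf_len) || iinf_len < 6)
        return -1;
    /* Skip version/flags plus entry_count (16-bit in v0, 32-bit otherwise). */
    off = (iinf[0] == 0) ? 6 : 8;
    int n = 0;
    while (n < max && next_box(iinf, iinf_len, &off, &type, &p, &plen)) {
        if (memcmp(type, "infe", 4) != 0 || plen < 4) continue;
        if (p[0] == 2 && plen >= 12) {          /* v2: 16-bit item_ID */
            out[n].id = be16(p + 4);
            memcpy(out[n].type, p + 8, 4);
        } else if (p[0] == 3 && plen >= 14) {   /* v3: 32-bit item_ID */
            out[n].id = be32(p + 4);
            memcpy(out[n].type, p + 10, 4);
        } else {
            continue;                           /* unsupported infe version */
        }
        out[n].type[4] = '\0';
        n++;
    }
    return n;
}
```

Comparing the returned IDs against the primary item ID from the pitm box shows which coded image items the default decoding path skips.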