How libavif Uses dav1d for AVIF Decoding

This article explains how the libavif library utilizes the highly optimized dav1d decoder to translate compressed AV1 bitstreams into renderable AVIF images. It covers the structural relationship between the AVIF container and the AV1 codec, the step-by-step decoding pipeline, and why dav1d is the preferred decoding engine for modern AVIF image processing.

The Relationship Between AVIF, libavif, and dav1d

To understand how libavif uses dav1d, it is essential to understand what an AVIF file actually is. AVIF (AV1 Image File Format) is not a brand-new image compression algorithm; rather, it wraps highly compressed AV1 video keyframes inside an ISOBMFF (ISO Base Media File Format) container.

Because of this hybrid structure, decoding an AVIF image requires two distinct operations:

1. Container Parsing: Reading the file structure, extracting metadata (such as Exif, XMP, and ICC color profiles), and locating the raw compressed image data.
2. Image Decoding: Decompressing the raw AV1 video bitstream into raw pixel data.

The libavif library is responsible for the first task (parsing the container). However, libavif does not actually contain an AV1 decoder. Instead, it acts as a wrapper and delegates the complex, CPU-intensive task of decompressing the AV1 bitstream to an external AV1 decoder. dav1d, developed by the VideoLAN project, is the primary and most popular decoder library used by libavif for this purpose.

The Step-by-Step Decoding Workflow

When an application requests libavif to decode an AVIF image, the library coordinates with dav1d through a structured sequence of API calls:

1. Parsing and Demuxing

First, libavif parses the AVIF container. It identifies the image properties, such as width, height, bit depth, and color configuration. It then extracts the compressed AV1 payload (the “item” data representing the image).
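The container walk described above can be sketched in a few lines. This is a minimal illustration of ISOBMFF box framing (a 32-bit big-endian size followed by a four-character type), not libavif's actual parser; the function names `iter_boxes`, `make_box`, and `read_brands` and the synthetic `sample` bytes are made up for this example.

```python
import struct


def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > end:
                raise ValueError("truncated largesize at offset %d" % offset)
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the enclosing range.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        yield box_type.decode("ascii", "replace"), offset + header, offset + size
        offset += size


def make_box(box_type, payload):
    """Build a box with a 32-bit size field (helper for the synthetic sample)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def read_brands(data):
    """Return (major_brand, compatible_brands) from the ftyp box, or None."""
    for box_type, start, end in iter_boxes(data):
        if box_type == "ftyp":
            major = data[start:start + 4]
            # Bytes 4..8 of the payload hold minor_version; brands follow.
            compatible = [data[i:i + 4] for i in range(start + 8, end, 4)]
            return major, compatible
    return None


# Synthetic two-box file: an ftyp box plus an mdat box holding a 2-byte payload.
sample = (make_box(b"ftyp", b"avif" + b"\x00" * 4 + b"avifmif1")
          + make_box(b"mdat", b"\x12\x00"))
box_types = [box_type for box_type, _, _ in iter_boxes(sample)]
brands = read_brands(sample)
```

A real AVIF file carries its item locations and image properties inside a `meta` box, which nests further boxes; the same walker can be applied recursively to a payload range by passing `offset` and `end`.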

2. Initializing the dav1d Context

Once the compressed AV1 payload is isolated, and provided libavif was compiled with dav1d support (a common build configuration), libavif prepares the decoder. It fills in a dav1d settings structure, including the number of threads to use for decoding, and then opens a dav1d decoder context with those settings. In dav1d's public API, these two steps correspond to dav1d_default_settings and dav1d_open.

3. Passing the Bitstream to dav1d

libavif feeds the raw AV1 bitstream into the dav1d decoder using dav1d’s input API (dav1d_send_data). The dav1d engine then parses the AV1 sequence headers, frame headers, and tile groups.
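To make the header parsing above more concrete, the following sketch splits an AV1 bitstream into its OBUs (Open Bitstream Units), the framing that carries sequence headers, frame headers, and tile groups. It reads only the one-byte OBU header, the optional extension byte, and the LEB128 size field; it does not interpret the payloads. The function names and the `sample` bytes are illustrative, and this is not how dav1d itself is structured.

```python
OBU_NAMES = {
    1: "SEQUENCE_HEADER", 2: "TEMPORAL_DELIMITER", 3: "FRAME_HEADER",
    4: "TILE_GROUP", 5: "METADATA", 6: "FRAME",
    7: "REDUNDANT_FRAME_HEADER", 8: "TILE_LIST", 15: "PADDING",
}


def read_leb128(data, pos):
    """Decode a little-endian base-128 integer; return (value, next_pos)."""
    value = 0
    for i in range(8):  # AV1 limits leb128 values to 8 bytes
        if pos + i >= len(data):
            raise ValueError("truncated leb128")
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos + i + 1
    raise ValueError("leb128 longer than 8 bytes")


def iter_obus(data):
    """Yield (obu_name, payload_bytes) for each OBU in the buffer."""
    pos = 0
    while pos < len(data):
        header = data[pos]
        if header & 0x80:
            raise ValueError("forbidden bit set at offset %d" % pos)
        obu_type = (header >> 3) & 0x0F
        has_extension = bool(header & 0x04)
        has_size = bool(header & 0x02)
        pos += 2 if has_extension else 1
        if has_size:
            size, pos = read_leb128(data, pos)
        else:
            size = len(data) - pos  # without a size field, the OBU fills the rest
        if pos + size > len(data):
            raise ValueError("OBU payload runs past end of buffer")
        yield OBU_NAMES.get(obu_type, "RESERVED_%d" % obu_type), data[pos:pos + size]
        pos += size


# Synthetic stream: a 2-byte sequence header OBU, then a 3-byte frame OBU.
sample = bytes([0x0A, 0x02, 0xAA, 0xBB, 0x32, 0x03, 0x01, 0x02, 0x03])
obus = list(iter_obus(sample))
```

The payload bytes in `sample` are placeholders, not valid AV1 syntax; only the framing is meaningful here.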

4. SIMD-Optimized Software Decoding

Inside dav1d, the computationally heavy work occurs. dav1d is a software decoder: it runs entirely on the CPU rather than on dedicated decoding hardware, and it uses hand-written assembly code (targeting specific CPU instruction sets such as x86 AVX2/AVX-512 and ARM NEON) to speed up stages such as inverse quantization, inverse transforms, prediction, and loop filtering. It reconstructs the image pixels and outputs a raw image frame.
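As a rough illustration of the reconstruction arithmetic named above, the sketch below dequantizes a row of coefficient levels, applies a one-dimensional inverse DCT, and adds the result to a prediction. It uses a floating-point orthonormal DCT for readability; AV1 specifies integer transforms and dav1d implements them in assembly, so this shows the idea only, not the decoder's code. All function names and the uniform quantizer `step` are assumptions made for the example.

```python
import math


def dct_basis(n, k, size):
    """Orthonormal DCT-II basis value for sample n and frequency k."""
    scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
    return scale * math.cos(math.pi * (n + 0.5) * k / size)


def forward_dct(samples):
    size = len(samples)
    return [sum(samples[n] * dct_basis(n, k, size) for n in range(size))
            for k in range(size)]


def inverse_dct(coeffs):
    size = len(coeffs)
    return [sum(coeffs[k] * dct_basis(n, k, size) for k in range(size))
            for n in range(size)]


def reconstruct(prediction, levels, step, bit_depth=8):
    """Dequantize levels, inverse-transform them, and add the prediction."""
    coeffs = [level * step for level in levels]   # inverse quantization
    residual = inverse_dct(coeffs)                # inverse transform
    max_value = (1 << bit_depth) - 1
    # Add the residual to the predicted samples and clamp to the valid range.
    return [min(max(int(round(p + r)), 0), max_value)
            for p, r in zip(prediction, residual)]


# A single DC level of 4 with step 2 adds a flat residual of 4 to each sample.
row = reconstruct([100, 100, 100, 100], [4, 0, 0, 0], step=2)
```

Loop filtering, which smooths block edges after this step, is omitted from the sketch.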

5. Retrieving the Raw Frame

Once decoding is complete, libavif retrieves the decoded picture from the decoder using dav1d's output API (dav1d_get_picture). The picture is stored as planar YUV, with chroma subsampling such as 4:2:0, 4:2:2, or 4:4:4, or as monochrome (4:0:0).
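The feed-and-retrieve pattern from steps 3 and 5 can be sketched as a loop. dav1d's public API pairs dav1d_send_data with dav1d_get_picture, and either call can report EAGAIN to signal "not now": the input call when the decoder cannot accept more data until output is drained, and the output call when no picture is ready yet. The code below mimics that contract with a hypothetical `MockDecoder` class written only for this example; it does not call dav1d, and libavif's real integration differs in detail.

```python
import errno
from collections import deque


class MockDecoder:
    """Hypothetical stand-in: buffers one chunk and emits one picture per chunk."""

    def __init__(self):
        self._pending = None

    def send_data(self, chunk):
        if self._pending is not None:
            return -errno.EAGAIN  # caller must retrieve output first
        self._pending = chunk
        return 0

    def get_picture(self):
        if self._pending is None:
            return -errno.EAGAIN, None  # nothing decoded yet
        chunk, self._pending = self._pending, None
        return 0, "picture(%d bytes)" % len(chunk)


def decode_all(decoder, chunks):
    """Feed every chunk and collect every picture, tolerating EAGAIN."""
    pictures = []
    queue = deque(chunks)
    while queue:
        status = decoder.send_data(queue[0])
        if status == 0:
            queue.popleft()  # chunk was consumed
        elif status != -errno.EAGAIN:
            raise RuntimeError("send_data failed: %d" % status)
        status, picture = decoder.get_picture()
        if status == 0:
            pictures.append(picture)
        elif status != -errno.EAGAIN:
            raise RuntimeError("get_picture failed: %d" % status)
    # Drain whatever is still buffered once all input has been sent.
    while True:
        status, picture = decoder.get_picture()
        if status != 0:
            break
        pictures.append(picture)
    return pictures


pictures = decode_all(MockDecoder(), [b"abc", b"de"])
```

For a still AVIF image there is typically a single coded frame for the color item, so the loop usually completes after one picture.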

6. Post-Processing and Output

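Finally, libavif takes the YUV pixel buffer received from dav1d and exposes it through its image structure. Container-level transformations specified in the AVIF file (such as rotation, mirroring, or cropping) are reported alongside the pixels as metadata, and the renderer is expected to apply them. libavif also provides conversion routines that turn the YUV pixels into the RGB format required by the host application.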
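The YUV-to-RGB step can be illustrated with a small conversion routine. The sketch assumes 8-bit, full-range samples and BT.601 luma weights; a real decoder selects the matrix and range from the color information signaled in the file, and libavif's own conversion code is considerably more general. The function names are made up for this example, and chroma is upsampled by simple nearest-neighbor replication.

```python
KR, KB = 0.299, 0.114      # BT.601 luma weights (an assumption for this sketch)
KG = 1.0 - KR - KB


def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, cb, cr):
    """Convert one full-range 8-bit YCbCr sample triple to RGB."""
    cb -= 128
    cr -= 128
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG   # from Y = KR*R + KG*G + KB*B
    return clamp8(r), clamp8(g), clamp8(b)


def yuv420_to_rgb(y_plane, cb_plane, cr_plane):
    """Convert 4:2:0 planes (lists of rows) to rows of RGB tuples.

    Each chroma sample covers a 2x2 block of luma samples.
    """
    return [[yuv_to_rgb(y, cb_plane[row // 2][col // 2], cr_plane[row // 2][col // 2])
             for col, y in enumerate(y_row)]
            for row, y_row in enumerate(y_plane)]


# A 2x2 image with neutral chroma decodes to four shades of gray.
gray = yuv420_to_rgb([[0, 255], [128, 64]], [[128]], [[128]])
```

With neutral chroma (128), each output pixel has equal red, green, and blue values matching its luma sample.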

Why libavif Relies on dav1d

While libavif can also be built against other AV1 decoders, such as libaom or libgav1, dav1d is the most widely used choice for decoding, for several reasons:

Speed and Performance: dav1d was built from the ground up to be the fastest software AV1 decoder, relying heavily on hand-written assembly optimizations.
Low Memory Footprint: It is designed to be lightweight, making it suitable for memory-constrained environments like web browsers and mobile devices.
Excellent Multi-threading: dav1d features tile-based and frame-based multi-threading. A still AVIF image is a single frame, so the benefit comes mainly from decoding its tiles in parallel, which shortens decode time for high-resolution images on multi-core processors.