How libavif Uses dav1d for AVIF Decoding
This article explains how the libavif library uses the highly
optimized dav1d decoder to turn the compressed AV1 bitstream stored in
an AVIF file into displayable pixels. It covers the structural
relationship between the AVIF container and the AV1 codec, the
step-by-step decoding pipeline, and why dav1d is the
preferred decoding engine for modern AVIF image processing.
The Relationship Between AVIF, libavif, and dav1d
To understand how libavif uses dav1d, it helps to know what an AVIF
file actually is. AVIF (AV1 Image File Format) is not a brand-new image
compression algorithm; rather, it stores AV1-coded frames (intra frames
for still images) in a HEIF container, which is itself built on ISOBMFF
(ISO Base Media File Format).
Because of this hybrid structure, decoding an AVIF image requires two distinct operations:
1. Container Parsing: Reading the file structure, extracting metadata (such as Exif, XMP, and ICC color profiles), and locating the raw compressed image data.
2. Image Decoding: Decompressing the raw AV1 video bitstream into raw pixel data.
The libavif library is responsible for the first task
(parsing the container). However, libavif does not contain an AV1
decoder of its own. Instead, it acts as a wrapper and delegates the
complex, CPU-intensive task of decompressing the AV1 bitstream to an
external AV1 decoder selected when libavif is built. dav1d, developed
by the VideoLAN project, is one of the most widely used decoder
backends for this purpose.
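The division of labor described above can be modeled in a few lines. This is a conceptual Python sketch, not libavif's C API: `CodecRegistry`, `fake_dav1d`, and `fake_libaom` are hypothetical names, and the "decoders" only report how many bytes they were handed. It shows a wrapper that owns no decoder itself and forwards the payload to whichever backend is available or explicitly requested.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical backend signature: AV1 payload in, decoded result out.
DecodeFn = Callable[[bytes], dict]


class CodecRegistry:
    """Toy model of a wrapper library that ships without its own decoder."""

    def __init__(self) -> None:
        self._backends: Dict[str, DecodeFn] = {}
        self._order: List[str] = []  # registration order doubles as priority

    def register(self, name: str, decode: DecodeFn) -> None:
        self._backends[name] = decode
        self._order.append(name)

    def choose(self, preferred: Optional[str] = None) -> str:
        """Return the requested backend, or the first registered one."""
        if preferred is not None:
            if preferred not in self._backends:
                raise LookupError(f"backend {preferred!r} is not available")
            return preferred
        if not self._order:
            raise LookupError("no AV1 decoder backend is available")
        return self._order[0]

    def decode(self, payload: bytes,
               preferred: Optional[str] = None) -> Tuple[str, dict]:
        name = self.choose(preferred)
        return name, self._backends[name](payload)


# Stand-ins for real decoders; they do no actual AV1 decoding.
def fake_dav1d(payload: bytes) -> dict:
    return {"decoder": "dav1d", "bytes_in": len(payload)}


def fake_libaom(payload: bytes) -> dict:
    return {"decoder": "libaom", "bytes_in": len(payload)}


registry = CodecRegistry()
registry.register("dav1d", fake_dav1d)
registry.register("libaom", fake_libaom)
```

Calling `registry.decode(payload)` uses the first registered backend, while `registry.decode(payload, "libaom")` forces a specific one. The point of the pattern is that the container code never needs to know how AV1 decoding works.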
The Step-by-Step Decoding Workflow
When an application requests libavif to decode an AVIF
image, the library coordinates with dav1d through a
structured sequence of API calls:
1. Parsing and Demuxing
First, libavif parses the AVIF container. It identifies
the image properties, such as width, height, bit depth, and color
configuration. It then extracts the compressed AV1 payload (the “item”
data representing the image).
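To make the container step more concrete, here is a minimal sketch of walking the top-level boxes of an ISOBMFF byte string. It relies only on the generic box header layout (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the data"). The helper names `iter_boxes` and `make_box` and the sample bytes are made up for illustration. A real parser such as libavif goes much further, for example by reading the item location information inside the `meta` box to find the coded image bytes.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, offset: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data."""
    if end is None:
        end = len(data)
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            if offset + 16 > end:
                raise ValueError(f"truncated box header at offset {offset}")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing data.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError(f"corrupt box at offset {offset}")
        yield box_type.decode("ascii", "replace"), offset + header, offset + size
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (illustration helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic sample: the payloads are placeholders, not a valid AVIF file.
sample = (
    make_box(b"ftyp", b"avif" + struct.pack(">I", 0) + b"avifmif1")
    + make_box(b"meta", b"\x00\x00\x00\x00")
    + make_box(b"mdat", b"\x12\x00\x0a")
)

box_types = [box_type for box_type, _, _ in iter_boxes(sample)]
```

For the synthetic sample, `box_types` lists the three boxes in file order, and slicing `sample` with the yielded start and end offsets recovers each payload.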
2. Initializing the dav1d Context
Once the compressed AV1 payload is isolated, libavif needs a decoder
to hand it to. If libavif was built with dav1d support (a common
configuration in browsers and operating systems), it opens a dav1d
decoder context and configures dav1d settings, such as the number of
CPU threads to allocate for decoding.
3. Passing the Bitstream to dav1d
libavif feeds the raw AV1 bitstream into the
dav1d decoder using dav1d’s input API
(dav1d_send_data). The dav1d engine then
parses the AV1 sequence headers, frame headers, and tile groups.
4. SIMD-Accelerated Software Decoding
Inside dav1d, the heavy mathematical lifting occurs. dav1d is a
software decoder: it runs on the CPU rather than on dedicated decoding
hardware, and gets its speed from highly optimized assembly code
(written for specific CPU architectures like x86 AVX2/AVX-512 and ARM
NEON). These routines perform inverse quantization, inverse transforms,
prediction, and loop filtering. dav1d reconstructs the image pixels and
outputs a raw image frame.
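As a rough illustration of what "inverse quantization" and "inverse transform" mean, the sketch below runs a four-sample residual through a floating-point DCT, quantizes it, and then reverses both steps. This is a simplified model under stated assumptions: AV1 uses integer transforms of several types and sizes, and a decoder only performs the dequantize and inverse steps. The forward transform and quantizer appear here only to produce test input, and all function names are illustrative.

```python
import math
from typing import List


def dct_matrix(n: int) -> List[List[float]]:
    """Orthonormal DCT-II basis: rows are frequencies, columns are samples."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def forward_dct(samples: List[float]) -> List[float]:
    n = len(samples)
    m = dct_matrix(n)
    return [sum(m[k][i] * samples[i] for i in range(n)) for k in range(n)]


def inverse_dct(coeffs: List[float]) -> List[float]:
    # The basis is orthonormal, so the inverse is the transpose.
    n = len(coeffs)
    m = dct_matrix(n)
    return [sum(m[k][i] * coeffs[k] for k in range(n)) for i in range(n)]


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Encoder side: map coefficients to integer levels (lossy)."""
    return [round(c / step) for c in coeffs]


def dequantize(levels: List[int], step: float) -> List[float]:
    """Decoder side: scale integer levels back to coefficient values."""
    return [level * step for level in levels]


residual = [10.0, 12.0, 15.0, 20.0]
step = 2.0
levels = quantize(forward_dct(residual), step)        # what the bitstream carries
reconstructed = inverse_dct(dequantize(levels, step))  # what the decoder rebuilds
```

The reconstructed samples are close to the input but not identical, because quantization discards precision. A larger `step` gives smaller levels to store and a larger reconstruction error.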
5. Retrieving the Raw Frame
Once decoding is complete, libavif retrieves the raw
frame from the decoder using the dav1d_get_picture API. This
frame is in a YUV color format (such as YUV 4:2:0, 4:2:2, or
4:4:4) or, for monochrome images, a single luma plane.
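Steps 3 and 5 together form a send/receive loop. dav1d's public API pairs `dav1d_send_data` with `dav1d_get_picture`, and either call can report EAGAIN: the first when the decoder cannot accept more input until output is collected, the second when no picture is ready yet. The sketch below models that control flow in Python with a hypothetical `MockDecoder`. It is not a binding to dav1d, and the "pictures" are just records of the input size.

```python
import errno
from collections import deque
from typing import List, Optional, Tuple


class MockDecoder:
    """Hypothetical stand-in for a decoder with a send/get style API."""

    def __init__(self, queue_limit: int = 1) -> None:
        self._pending = deque()
        self._queue_limit = queue_limit

    def send_data(self, chunk: bytes) -> int:
        if len(self._pending) >= self._queue_limit:
            return -errno.EAGAIN  # caller must collect output first
        self._pending.append(chunk)
        return 0

    def get_picture(self) -> Tuple[int, Optional[dict]]:
        if not self._pending:
            return -errno.EAGAIN, None  # nothing ready yet
        chunk = self._pending.popleft()
        return 0, {"size": len(chunk)}


def decode_all(decoder: MockDecoder, chunks: List[bytes]) -> List[dict]:
    pictures = []
    for chunk in chunks:
        while True:
            status = decoder.send_data(chunk)
            if status == 0:
                break
            if status != -errno.EAGAIN:
                raise RuntimeError(f"send failed with {status}")
            # Input queue is full: collect one picture, then retry the send.
            got, picture = decoder.get_picture()
            if got == 0:
                pictures.append(picture)
            elif got != -errno.EAGAIN:
                raise RuntimeError(f"get failed with {got}")
    # All input has been sent: drain whatever is still queued.
    while True:
        got, picture = decoder.get_picture()
        if got == -errno.EAGAIN:
            break
        if got != 0:
            raise RuntimeError(f"get failed with {got}")
        pictures.append(picture)
    return pictures


pictures = decode_all(MockDecoder(queue_limit=1), [b"a" * 3, b"b" * 5])
```

For a single still image the loop usually collapses to one send and one get, but the same structure also covers animated AVIF sequences with many frames.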
6. Post-Processing and Output
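Finally, libavif takes the YUV pixel buffer received
from dav1d and exposes it to the application together with the
container-level metadata. Transformations specified in the AVIF file
(such as rotation, mirroring, or cropping) are reported as properties
of the decoded image rather than applied to the pixel buffer, so the
host application is expected to apply them. When the application needs
RGB output, it asks libavif to convert the YUV pixels into the RGB
layout it requires.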
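The YUV-to-RGB step can be sketched as follows. This example assumes 8-bit, full-range samples and BT.601 coefficients, with nearest-neighbour upsampling of the 4:2:0 chroma planes. A real converter picks the matrix and range from the color information signalled in the file and typically offers better chroma upsampling filters. The function names are illustrative.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def clamp8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb_pixel(y: int, u: int, v: int) -> Pixel:
    """Full-range BT.601 conversion for 8-bit samples (assumed here)."""
    cb = u - 128
    cr = v - 128
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return clamp8(r), clamp8(g), clamp8(b)


def yuv420_to_rgb(y_plane: List[List[int]],
                  u_plane: List[List[int]],
                  v_plane: List[List[int]]) -> List[List[Pixel]]:
    """Convert 4:2:0 planes (lists of rows) to rows of RGB tuples.

    Each chroma sample covers a 2x2 block of luma samples, so the chroma
    planes are indexed with halved coordinates (nearest neighbour).
    """
    rgb = []
    for row, y_row in enumerate(y_plane):
        out_row = []
        for col, y in enumerate(y_row):
            u = u_plane[row // 2][col // 2]
            v = v_plane[row // 2][col // 2]
            out_row.append(yuv_to_rgb_pixel(y, u, v))
        rgb.append(out_row)
    return rgb


# A 2x2 grey frame: neutral chroma (128) leaves R = G = B = Y.
frame_rgb = yuv420_to_rgb([[0, 128], [200, 255]], [[128]], [[128]])
```

With neutral chroma the output is pure grey at the luma level, which makes a convenient sanity check before trying colored input.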
Why libavif Relies on dav1d
While libavif can also be built against other AV1 decoders,
such as Google's libaom and libgav1, dav1d is the most common
choice for decoding for several reasons:
- Speed and Performance: dav1d was built from the ground up with the goal of being the fastest software AV1 decoder, relying heavily on hand-written assembly optimizations.
- Low Memory Footprint: It is designed to be lightweight, making it suitable for memory-constrained environments like web browsers and mobile devices.
- Excellent Multi-threading: dav1d supports both tile-level and frame-level parallelism. For a single still image, tile-level threading is what spreads the work across cores, so large AVIF images encoded with multiple tiles decode much faster on multi-core processors; frame-level threading mainly benefits animated sequences.
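The idea behind tile-based threading can be sketched with a thread pool. The example splits a frame into rectangular tiles, "decodes" each one in a worker, and copies the results into a shared frame buffer. It is a structural illustration only: `decode_tile` is a placeholder that fills the tile with a value derived from its position, Python threads do not speed up CPU-bound work the way dav1d's native threads do, and a real decoder must also coordinate filtering across tile boundaries.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def split_into_tiles(width: int, height: int,
                     tile_w: int, tile_h: int) -> List[Tile]:
    """Return rectangles covering the frame in raster order."""
    tiles = []
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            tiles.append((x, y, min(tile_w, width - x), min(tile_h, height - y)))
    return tiles


def decode_tile(tile: Tile) -> Tuple[Tile, List[List[int]]]:
    """Placeholder for real tile decoding."""
    x, y, w, h = tile
    value = (x + y) % 256
    return tile, [[value] * w for _ in range(h)]


def decode_frame(width: int, height: int, tile_w: int, tile_h: int,
                 workers: int = 4) -> List[List[int]]:
    frame = [[0] * width for _ in range(height)]
    tiles = split_into_tiles(width, height, tile_w, tile_h)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tiles are independent, so they can be processed in any order;
        # each result is written into its own region of the frame.
        for (x, y, w, h), pixels in pool.map(decode_tile, tiles):
            for row in range(h):
                frame[y + row][x:x + w] = pixels[row]
    return frame


tiles = split_into_tiles(5, 3, 4, 2)
frame = decode_frame(5, 3, 4, 2)
```

Because every tile writes to a disjoint region, the result is the same whether one worker or several are used. That independence is what makes the tiles safe to decode in parallel.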