How libavif leverages SVT-AV1 for AVIF encoding

This article explains how the libavif library utilizes the SVT-AV1 encoder to produce high-quality, highly compressed AVIF images. It covers the architectural relationship between the library and the encoder, the performance benefits of this integration, and how the underlying technology processes image data.

The Relationship Between libavif and SVT-AV1

libavif is a portable C library used for encoding and decoding AVIF (AV1 Image File Format) files. However, libavif itself does not contain the core algorithms required to compress raw pixels into AV1 bitstreams. Instead, it acts as a wrapper and muxer that relies on external codec libraries to perform the heavy lifting.

SVT-AV1 (Scalable Video Technology for AV1) is one of the encoding backends supported by libavif, alongside others such as libaom and rav1e. Originally developed by Intel in collaboration with Netflix, and later adopted by the Alliance for Open Media (AOMedia) as a production encoder, SVT-AV1 is a highly optimized, CPU-based AV1 video encoder. When configured to use SVT-AV1, libavif passes raw image data to the encoder and receives a compliant AV1 compressed payload, which it then packages into the final AVIF container (a HEIF-based file format).
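As a rough illustration of selecting this backend, the sketch below builds a command line for avifenc, the command-line encoder shipped with libavif. The helper name build_avifenc_command and the default values are hypothetical; the sketch assumes an avifenc build with SVT-AV1 support and uses the short options for codec (-c), speed (-s), and worker threads (-j).

```python
from typing import List


def build_avifenc_command(src: str, dst: str, speed: int = 6, jobs: int = 4) -> List[str]:
    """Return an argument list asking avifenc to encode with SVT-AV1.

    Hypothetical helper for illustration; it only assembles arguments
    and does not check that avifenc or SVT-AV1 support is available.
    """
    if not 0 <= speed <= 10:
        raise ValueError("libavif speed must be between 0 and 10")
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    # -c selects the codec, -s the speed, -j the number of worker threads.
    return ["avifenc", "-c", "svt", "-s", str(speed), "-j", str(jobs), src, dst]


if __name__ == "__main__":
    # Printing instead of executing keeps the sketch usable without avifenc installed.
    print(" ".join(build_avifenc_command("photo.png", "photo.avif")))
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where avifenc is installed. If the build lacks SVT-AV1, avifenc is expected to report that the codec is unavailable rather than silently fall back.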

How libavif Processes Images Using SVT-AV1

1. Mapping Images to Video Frames

Because SVT-AV1 is natively a video encoder, libavif must translate still image properties into video parameters.

* Still Images: For standard AVIF images, libavif presents the single image to SVT-AV1 as a video sequence consisting of exactly one frame. This frame is encoded as an “Intra-frame” (I-frame or Key Frame), meaning it does not rely on temporal references to other frames.
* Animated AVIFs: For animated AVIFs, libavif passes a sequence of source images to SVT-AV1, which then utilizes inter-frame compression techniques (predicting pixel movement across frames) to significantly reduce the file size of the animation.
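The mapping from images to frame types can be sketched conceptually as follows. This is a simplified model, not libavif or SVT-AV1 code: the function plan_frame_types and its keyframe_interval parameter are hypothetical, and a real encoder may insert additional key frames on its own.

```python
from typing import List


def plan_frame_types(image_count: int, keyframe_interval: int = 0) -> List[str]:
    """Label each source image as a key frame or an inter frame.

    Conceptual model only: the first frame is always a key frame, and
    keyframe_interval (0 = disabled) forces one every N frames.
    """
    if image_count < 1:
        raise ValueError("at least one image is required")
    types = []
    for index in range(image_count):
        forced = keyframe_interval > 0 and index % keyframe_interval == 0
        types.append("KEY" if index == 0 or forced else "INTER")
    return types


if __name__ == "__main__":
    print(plan_frame_types(1))     # a still image: one key frame
    print(plan_frame_types(5, 2))  # an animation with periodic key frames
```

A still image therefore never pays for inter-frame machinery, while an animation trades seekability (more key frames) against file size (more inter frames).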

2. Translating Settings and Quality Parameters

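libavif exposes standard user controls, such as speed, target quality, and color depth, and maps them to SVT-AV1’s configuration:

* Speed/Complexity: SVT-AV1 offers presets ranging from 0 (slowest, best compression efficiency) up to 13 (fastest, lowest efficiency), although the exact range depends on the SVT-AV1 version. libavif’s own speed setting runs from 0 to 10 and is handed to SVT-AV1 as the preset number, so presets above 10 are generally not reachable through this setting.
* Quantization (QP): Quality levels in libavif (0–100) are translated into an AV1 quantizer value in the range 0–63, where a lower quality means a higher quantizer. SVT-AV1 uses this value to determine how coarsely to quantize the transform coefficients, which tends to discard fine, high-frequency detail first.
* Color Format: libavif passes the chroma subsampling and bit depth of the source image to SVT-AV1. Although AVIF itself allows YUV 4:2:0, 4:2:2, and 4:4:4 at 8, 10, or 12 bits, SVT-AV1 has historically accepted only 4:2:0 input at 8 or 10 bits, so other combinations may be rejected by this backend and require a different encoder.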
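The translation of quality and speed can be sketched as below. This assumes a rounded linear mapping from the 0–100 quality scale onto the 0–63 quantizer range and a simple clamp for speed; both function names are hypothetical, and the exact arithmetic in any given libavif release may differ.

```python
def quality_to_quantizer(quality: int) -> int:
    """Map a 0-100 quality value to a 0-63 quantizer (assumed linear mapping).

    Higher quality yields a lower quantizer, i.e. finer quantization.
    """
    quality = max(0, min(100, quality))
    # Integer arithmetic with +50 rounds to the nearest quantizer step.
    return ((100 - quality) * 63 + 50) // 100


def speed_to_preset(speed: int) -> int:
    """Clamp a libavif-style speed (0-10) and use it as the encoder preset."""
    return max(0, min(10, speed))


if __name__ == "__main__":
    for q in (0, 50, 60, 100):
        print(f"quality {q:3d} -> quantizer {quality_to_quantizer(q)}")
    print("speed 12 -> preset", speed_to_preset(12))
```

Because the quantizer scale is much coarser than the quality scale, several neighbouring quality values land on the same quantizer, so small changes in quality may produce identical output.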

3. Leveraging Multi-Threading and Modern CPU Architectures

A frequently cited advantage of SVT-AV1 over other AV1 encoders (like libaom) is its architectural design. SVT-AV1 was built from the start to scale efficiently across modern multi-core processors.

When libavif invokes SVT-AV1, the encoder can divide the image into independently coded regions (tiles, arranged in rows and columns) and process them in parallel across multiple CPU threads, within the thread limit the caller configures. This multi-threaded approach can substantially reduce the time required to encode high-resolution images, making AVIF creation more practical for on-the-fly web server image generation and batch processing workflows. The benefit is smaller for small images, which offer fewer regions to distribute.
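The tile-based split can be sketched as follows. The sketch assumes 64-pixel superblocks and the uniform tile spacing used by AV1, where the number of tile rows and columns is given as a power of two; all function names are hypothetical, and encode_tile is a placeholder that only counts pixels instead of compressing anything.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

SUPERBLOCK = 64  # assumed superblock size in pixels

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def uniform_edges(size_px: int, log2_count: int) -> List[int]:
    """Return tile boundaries along one axis, aligned to superblocks."""
    sb_total = -(-size_px // SUPERBLOCK)  # ceiling division
    tile_sb = (sb_total + (1 << log2_count) - 1) >> log2_count
    edges = [start * SUPERBLOCK for start in range(0, sb_total, tile_sb)]
    edges.append(size_px)  # the last tile ends at the image edge
    return edges


def tile_rects(width: int, height: int, rows_log2: int, cols_log2: int) -> List[Rect]:
    """List the rectangles of every tile in row-major order."""
    xs = uniform_edges(width, cols_log2)
    ys = uniform_edges(height, rows_log2)
    return [(x0, y0, x1, y1)
            for y0, y1 in zip(ys, ys[1:])
            for x0, x1 in zip(xs, xs[1:])]


def encode_tile(rect: Rect) -> int:
    """Placeholder for real encoding work: returns the tile's pixel count."""
    x0, y0, x1, y1 = rect
    return (x1 - x0) * (y1 - y0)


def encode_in_parallel(width: int, height: int, rows_log2: int,
                       cols_log2: int, threads: int) -> List[int]:
    """Process every tile on a thread pool and collect the results in order."""
    rects = tile_rects(width, height, rows_log2, cols_log2)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(encode_tile, rects))


if __name__ == "__main__":
    sizes = encode_in_parallel(1920, 1080, rows_log2=1, cols_log2=2, threads=4)
    print(len(sizes), "tiles,", sum(sizes), "pixels in total")
```

Python threads are used here only to show the structure of the work split; they do not reproduce the speedup of a native encoder. The sketch also shows why small images benefit less: with 64-pixel superblocks, a 256-pixel-wide image cannot be divided into more than four tile columns.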