How libavif leverages SVT-AV1 for AVIF encoding
This article explains how the libavif library utilizes
the SVT-AV1 encoder to produce high-quality, highly
compressed AVIF images. It covers the architectural relationship between
the library and the encoder, the performance benefits of this
integration, and how the underlying technology processes image data.
The Relationship Between libavif and SVT-AV1
libavif is a portable C library used for encoding and
decoding AVIF (AV1 Image File Format) files. However,
libavif itself does not contain the core algorithms
required to compress raw pixels into AV1 bitstreams. Instead, it acts as
a wrapper and muxer that relies on external codec libraries to perform
the heavy lifting.
SVT-AV1 (Scalable Video Technology for AV1) is one of
the encoding backends supported by libavif, alongside libaom and rav1e.
Originally developed by Intel in collaboration with Netflix and later adopted
by the Alliance for Open Media (AOMedia), SVT-AV1 is a
highly optimized, CPU-based AV1 video encoder. When configured to use
SVT-AV1, libavif passes raw image data to the encoder and
receives a compliant AV1 compressed payload, which it then packages into
the final HEIF/AVIF container.
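The division of labor described above can be sketched in a few lines of Python. This is only an illustrative model, not libavif's actual code: `RawImage`, `fake_svt_encode`, `BACKENDS`, `mux_into_container`, and the toy header are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RawImage:
    width: int
    height: int
    pixels: bytes  # raw samples; the layout is left unspecified in this sketch


# Hypothetical backend signature: raw image in, compressed AV1 payload out.
EncodeFn = Callable[[RawImage], bytes]


def fake_svt_encode(image: RawImage) -> bytes:
    """Stand-in for the real encoder; returns a placeholder payload."""
    return b"AV1:" + len(image.pixels).to_bytes(4, "big")


# The wrapper only knows backends by name; it contains no compression logic.
BACKENDS: Dict[str, EncodeFn] = {"svt": fake_svt_encode}


def mux_into_container(payload: bytes) -> bytes:
    """Stand-in for container writing: a toy header plus the payload."""
    return b"TOYBOX" + len(payload).to_bytes(4, "big") + payload


def encode_avif(image: RawImage, codec: str = "svt") -> bytes:
    encode = BACKENDS[codec]            # 1. pick the external codec
    payload = encode(image)             # 2. the codec does the compression
    return mux_into_container(payload)  # 3. the wrapper packages the result
```

The point of the sketch is the shape of `encode_avif`: the wrapper selects a backend, hands over pixels, and wraps whatever bytes come back.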
How libavif Processes Images Using SVT-AV1
1. Mapping Images to Video Frames
Because SVT-AV1 is natively a video encoder, libavif
must translate still image properties into video parameters.
* Still Images: For standard AVIF images, libavif presents the single image to SVT-AV1 as a video sequence consisting of exactly one frame. This frame is encoded as an “Intra-frame” (I-frame or Key Frame), meaning it does not rely on temporal references to other frames.
* Animated AVIFs: For animated AVIFs, libavif passes a sequence of source images to SVT-AV1, which can then use inter-frame compression (predicting blocks of a frame from previously encoded frames) to significantly reduce the file size of the animation.
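As a rough illustration of the still-versus-animated distinction, the sketch below plans which frames are key frames. `FramePlan` and `plan_frames` are hypothetical names, and the sketch assumes only the first frame is a key frame; a real encoder may insert additional key frames later in a sequence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FramePlan:
    index: int
    key_frame: bool  # True: intra-only, no references to other frames


def plan_frames(image_count: int) -> List[FramePlan]:
    """Mark the first frame as a key frame; later frames may use inter prediction."""
    if image_count < 1:
        raise ValueError("need at least one image")
    return [FramePlan(i, key_frame=(i == 0)) for i in range(image_count)]


still = plan_frames(1)      # a still image: one frame, and it is a key frame
animation = plan_frames(3)  # an animation: key frame followed by inter frames
```

A still image is simply the degenerate case of the same plan: a sequence of length one whose only frame has nothing to reference.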
2. Translating Settings and Quality Parameters
libavif exposes standard user controls (such as speed
presets, target quality, and color depth) and maps them to SVT-AV1’s
internal APIs:
* Speed/Complexity: SVT-AV1 offers preset levels ranging from 0 (slowest, highest quality) up to 13 (fastest, lowest quality), though the exact upper bound depends on the SVT-AV1 version. libavif's speed setting uses a 0–10 scale and is passed to SVT-AV1 as the preset, possibly clamped to the range the integration supports.
* Quantization (QP): Quality levels in libavif are translated into SVT-AV1’s Quantization Parameters. SVT-AV1 uses these parameters to determine how coarsely to quantize transform coefficients, which mostly affects fine, high-frequency detail.
* Color Profiles: libavif passes metadata regarding chroma subsampling and bit depth to SVT-AV1 to ensure precise color representation. Although AVIF itself allows YUV 4:2:0, 4:2:2, and 4:4:4 at 8, 10, or 12 bits, SVT-AV1 has historically encoded only 4:2:0 content at 8 or 10 bits, so other formats may require a different backend.
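The speed and quality translation can be illustrated with a small sketch. The function names are hypothetical, and the formulas are assumptions rather than a copy of libavif's code: the speed is clamped into a preset range, and a 0–100 quality value is mapped linearly (with rounding) onto a 0–63 quantizer scale, where 0 is the finest quantization.

```python
SVT_PRESET_MIN = 0
SVT_PRESET_MAX = 13  # assumed upper bound; varies by SVT-AV1 version


def speed_to_preset(speed: int,
                    preset_min: int = SVT_PRESET_MIN,
                    preset_max: int = SVT_PRESET_MAX) -> int:
    """Clamp a wrapper-level speed value into the encoder's preset range."""
    return max(preset_min, min(preset_max, speed))


def quality_to_quantizer(quality: int) -> int:
    """Map quality 0..100 (higher is better) to quantizer 63..0 (lower is better)."""
    quality = max(0, min(100, quality))
    # Linear mapping with rounding to the nearest integer quantizer.
    return ((100 - quality) * 63 + 50) // 100
```

For example, under these assumptions `quality_to_quantizer(100)` gives 0, `quality_to_quantizer(0)` gives 63, and `speed_to_preset(10, preset_max=8)` gives 8. The inversion is worth noting: a higher quality means a lower quantizer.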
3. Leveraging Multi-Threading and Modern CPU Architectures
A key advantage of leveraging SVT-AV1 over other AV1 encoders
(like libaom) is its architectural design. SVT-AV1 is built
specifically to scale efficiently across modern multi-core
processors.
When libavif invokes SVT-AV1, the encoder can divide the
image into independently processed regions (such as tiles arranged in rows and columns) and process them
in parallel across multiple CPU threads. This multi-threaded approach
can substantially reduce the time required to encode high-resolution images,
making AVIF creation viable for on-the-fly web server image generation
and batch processing workflows.
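The idea of splitting an image into independent regions and processing them concurrently can be sketched as follows. This is a structural illustration only: `split_into_tiles`, `encode_tile`, and `encode_tiles_in_parallel` are hypothetical names, the per-tile "work" is a placeholder, and real AV1 tiles are aligned to superblock boundaries, which this sketch ignores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def split_into_tiles(width: int, height: int,
                     tile_cols: int, tile_rows: int) -> List[Tile]:
    """Split the image into a tile_rows x tile_cols grid covering every pixel once."""
    tiles = []
    for r in range(tile_rows):
        y0 = r * height // tile_rows
        y1 = (r + 1) * height // tile_rows
        for c in range(tile_cols):
            x0 = c * width // tile_cols
            x1 = (c + 1) * width // tile_cols
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles


def encode_tile(tile: Tile) -> int:
    """Placeholder for per-tile encoding; returns the tile's pixel count."""
    _, _, w, h = tile
    return w * h


def encode_tiles_in_parallel(width: int, height: int, tile_cols: int,
                             tile_rows: int, threads: int = 4) -> List[int]:
    tiles = split_into_tiles(width, height, tile_cols, tile_rows)
    # Each tile is independent, so the work can be handed to a thread pool.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(encode_tile, tiles))
```

In pure Python the threads would not speed up CPU-bound work; a native encoder such as SVT-AV1 runs its workers on separate cores. The sketch only shows why independence between regions is what makes the parallelism possible.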