How libavif Encodes Images with Alpha Channels
This article explains how the libavif library handles
and encodes transparency (alpha channels) when creating AVIF images. It
covers the separation of color and transparency data, the dual-stream
encoding process using the AV1 codec, and how these elements are
reconstructed by decoders to render transparent images.
The Dual-Stream Architecture
Unlike older image formats that compress red, green, blue, and alpha (RGBA) channels together in a single pixel-interleaved stream, AVIF handles transparency by separating the image into two distinct streams. This architecture follows from AVIF being built on the AV1 video compression standard, which codes at most three color planes (YUV or RGB) and has no native “RGBA” mode.
To bypass this limitation, libavif splits an incoming transparent image into two parts:
1. The Primary Image: Containing the RGB color data.
2. The Auxiliary Image: Containing the monochrome alpha (transparency) data.
The Encoding Process in libavif
When you pass a transparent image (for example, pixels decoded from a PNG
with an alpha channel) to libavif, the library executes the following
steps:
1. Channel Separation
The library receives the input pixels and separates the color channels (RGB) from the alpha channel (A). The alpha channel is extracted as a single-channel, grayscale image where white represents complete opacity and black represents complete transparency.
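The separation step can be sketched in a few lines of C. This is a minimal illustration, not libavif's own code: it assumes 8-bit interleaved RGBA input, and the helper name split_rgba is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libavif function): split an interleaved
 * 8-bit RGBA buffer into a packed RGB buffer and a one-channel alpha plane.
 * rgb_out must hold 3 * pixel_count bytes, alpha_out pixel_count bytes. */
void split_rgba(const uint8_t *rgba, size_t pixel_count,
                uint8_t *rgb_out, uint8_t *alpha_out)
{
    assert(rgba != NULL && rgb_out != NULL && alpha_out != NULL);

    for (size_t i = 0; i < pixel_count; ++i) {
        rgb_out[3 * i + 0] = rgba[4 * i + 0]; /* R */
        rgb_out[3 * i + 1] = rgba[4 * i + 1]; /* G */
        rgb_out[3 * i + 2] = rgba[4 * i + 2]; /* B */
        /* Alpha becomes a grayscale sample: 255 = opaque, 0 = transparent. */
        alpha_out[i] = rgba[4 * i + 3];
    }
}
```

After the call, rgb_out and alpha_out are two independent images of the same dimensions, which is what makes the separate compression in the next step possible.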
2. Independent Compression
Both the color image and the alpha image are compressed independently
using an underlying AV1 encoder (such as aom, rav1e, or svt-av1):
* Color Stream: The RGB data is typically converted to the YUV color space (often with 4:2:0, 4:2:2, or 4:4:4 chroma subsampling) and compressed.
* Alpha Stream: The alpha data is compressed as a monochrome AV1 bitstream. Because it is compressed independently, libavif allows you to set different quality levels for color and transparency. For example, you can compress the color data heavily while keeping the alpha channel lossless to maintain sharp, clean edges.
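To make the difference between the two streams concrete, the sketch below counts the samples each one carries for a given chroma subsampling mode (4:4:4, 4:2:2, or 4:2:0). The type and function names are hypothetical and not part of libavif. The one assumption worth stating is that subsampled chroma dimensions round up for odd sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not libavif types. */
typedef enum { SUBSAMPLING_444, SUBSAMPLING_422, SUBSAMPLING_420 } Subsampling;

typedef struct {
    uint32_t luma_w, luma_h;     /* Y plane size */
    uint32_t chroma_w, chroma_h; /* size of each of the U and V planes */
} PlaneDims;

/* Compute plane sizes; subsampled dimensions round up for odd inputs. */
PlaneDims plane_dims(uint32_t width, uint32_t height, Subsampling s)
{
    PlaneDims d = { width, height, width, height };
    if (s == SUBSAMPLING_422 || s == SUBSAMPLING_420) {
        d.chroma_w = (width + 1) / 2;  /* half horizontal resolution */
    }
    if (s == SUBSAMPLING_420) {
        d.chroma_h = (height + 1) / 2; /* half vertical resolution */
    }
    return d;
}

/* Samples in the color stream: one luma plane plus two chroma planes. */
size_t color_sample_count(uint32_t width, uint32_t height, Subsampling s)
{
    PlaneDims d = plane_dims(width, height, s);
    return (size_t)d.luma_w * d.luma_h + 2 * ((size_t)d.chroma_w * d.chroma_h);
}

/* Samples in the alpha stream: a single full-resolution monochrome plane. */
size_t alpha_sample_count(uint32_t width, uint32_t height)
{
    return (size_t)width * height;
}
```

The alpha plane always stays at full resolution, which is one reason it is often given a higher quality setting than the color stream: there is no subsampling to hide behind when edges need to stay sharp.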
3. Container Packaging (ISOBMFF)
Once both bitstreams are compressed, libavif packages
them into the HEIF container format, which is itself built on the ISO
Base Media File Format (ISOBMFF).
Within the container, the color bitstream is marked as the primary
item. The alpha bitstream is stored as an “auxiliary item” and is linked
to the primary item by an item reference of type auxl. An auxiliary-type
property (auxC) attached to that item explicitly flags it as the alpha
channel, telling decoders exactly how to interpret the grayscale data.
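The linkage can be modeled with two small tables. The following is a simplified sketch of how a reader might locate the alpha item, not libavif's parser: the struct layouts and the function find_alpha_item are hypothetical, and returning 0 for "not found" is an assumption of this sketch. In HEIF an auxl reference points from the auxiliary item to its master image, and the alpha flag is an auxC property whose type string is the URN shown below.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Auxiliary type URN that marks an item as an alpha plane. */
#define ALPHA_URN "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha"

/* Hypothetical, simplified view of a container item. */
typedef struct {
    uint32_t id;
    const char *aux_type; /* auxC type string, or NULL if not auxiliary */
} Item;

/* Hypothetical, simplified view of one item reference. */
typedef struct {
    char type[5];     /* four-character reference type, e.g. "auxl" */
    uint32_t from_id; /* the auxiliary item */
    uint32_t to_id;   /* the master (here: primary) item */
} ItemRef;

/* Return the id of the alpha item linked to primary_id, or 0 if none. */
uint32_t find_alpha_item(const Item *items, size_t item_count,
                         const ItemRef *refs, size_t ref_count,
                         uint32_t primary_id)
{
    for (size_t r = 0; r < ref_count; ++r) {
        if (strcmp(refs[r].type, "auxl") != 0 || refs[r].to_id != primary_id) {
            continue; /* not an auxiliary link to the primary item */
        }
        for (size_t i = 0; i < item_count; ++i) {
            if (items[i].id == refs[r].from_id && items[i].aux_type != NULL &&
                strcmp(items[i].aux_type, ALPHA_URN) == 0) {
                return items[i].id;
            }
        }
    }
    return 0;
}
```

Checking both the reference and the type string matters because auxl links are not specific to alpha: the same mechanism can attach other auxiliary images, such as depth maps, to a primary item.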
Decoding and Reconstitution
When an AVIF-compatible viewer or web browser opens the image, the decoder reverses this process. It decodes both the primary color stream and the auxiliary alpha stream, converts the color data back to RGB, and then merges the grayscale alpha map with the color channels, pixel for pixel, to produce the final transparent RGBA image.
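The final merge is the inverse of the channel separation performed at encode time. A minimal sketch, assuming both streams have already been decoded to 8-bit buffers of identical dimensions and the color data is already back in RGB; merge_rgba is a hypothetical helper, not a libavif function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interleave a packed RGB buffer and an alpha plane
 * into RGBA. If alpha is NULL (the file had no alpha item), every pixel
 * is written as fully opaque. rgba_out must hold 4 * pixel_count bytes. */
void merge_rgba(const uint8_t *rgb, const uint8_t *alpha,
                size_t pixel_count, uint8_t *rgba_out)
{
    assert(rgb != NULL && rgba_out != NULL);

    for (size_t i = 0; i < pixel_count; ++i) {
        rgba_out[4 * i + 0] = rgb[3 * i + 0]; /* R */
        rgba_out[4 * i + 1] = rgb[3 * i + 1]; /* G */
        rgba_out[4 * i + 2] = rgb[3 * i + 2]; /* B */
        rgba_out[4 * i + 3] = alpha ? alpha[i] : 255; /* A */
    }
}
```

Treating a missing alpha plane as fully opaque lets the same output path serve images with and without transparency.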