How libavif Encodes Images with Alpha Channels

This article explains how the libavif library handles and encodes transparency (alpha channels) when creating AVIF images. It covers the separation of color and transparency data, the dual-stream encoding process using the AV1 codec, and how these elements are reconstructed by decoders to render transparent images.

The Dual-Stream Architecture

Unlike older image formats such as PNG, which store red, green, blue, and alpha (RGBA) samples together in a single pixel-interleaved stream, AVIF stores transparency separately from color. This architecture follows from AVIF being based on the AV1 video compression standard. An AV1 frame carries at most three planes (one luma and two chroma planes, or a single luma plane for monochrome content), so the codec has no native "RGBA" mode and no slot for a fourth alpha plane.

To work around this limitation, libavif splits an incoming transparent image into two parts:

1. The Primary Image: contains the color data.
2. The Auxiliary Image: contains the monochrome alpha (transparency) data.

The Encoding Process in libavif

When you encode a transparent image with libavif (for example, a PNG with an alpha channel passed to the avifenc command-line tool), the library executes the following steps:

1. Channel Separation

The library receives the input pixels and separates the color channels (RGB) from the alpha channel (A). The alpha channel is extracted as a single-channel, grayscale image where white represents complete opacity and black represents complete transparency.
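The separation step can be illustrated with a minimal sketch. This is not libavif's code (the library is written in C and works on its own image structures); the function name split_rgba and the 8-bit interleaved RGBA input layout are assumptions chosen for illustration.

```python
def split_rgba(rgba: bytes, width: int, height: int):
    """Split interleaved 8-bit RGBA into interleaved RGB and a separate alpha plane.

    Hypothetical helper for illustration only; not part of libavif.
    """
    if len(rgba) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")

    rgb = bytearray(width * height * 3)
    rgb[0::3] = rgba[0::4]  # red samples
    rgb[1::3] = rgba[1::4]  # green samples
    rgb[2::3] = rgba[2::4]  # blue samples

    # Every fourth byte is alpha: 255 = fully opaque, 0 = fully transparent.
    alpha = rgba[3::4]
    return bytes(rgb), bytes(alpha)


# A 2x1 image: one opaque red pixel, one fully transparent blue pixel.
pixels = bytes([255, 0, 0, 255,   0, 0, 255, 0])
rgb, alpha = split_rgba(pixels, 2, 1)
```

After the call, rgb holds only the color samples and alpha holds one grayscale sample per pixel, which is the form in which each is handed to the encoder.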

2. Independent Compression

Both the color image and the alpha image are compressed independently using an underlying AV1 encoder (such as aom, rav1e, or svt-av1):

* Color Stream: The RGB data is typically converted to a YUV color space (often with 4:2:0, 4:2:2, or 4:4:4 chroma subsampling) and compressed.
* Alpha Stream: The alpha data is compressed as a monochrome AV1 bitstream.

Because the two streams are compressed independently, libavif lets you set different quality levels for color and transparency. For example, you can compress the color data heavily while keeping the alpha channel lossless to maintain sharp, clean edges.
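To make the difference between the two streams concrete, the following sketch counts the samples each one carries before compression. The function names and the string labels for the subsampling modes are assumptions for illustration. The rounding rule (odd dimensions round up when halved) is the usual convention for subsampled chroma planes.

```python
def chroma_plane_size(width: int, height: int, subsampling: str):
    """Return (width, height) of each chroma plane for a subsampling mode.

    "444" keeps full resolution, "422" halves the width, and
    "420" halves both width and height (rounding odd sizes up).
    """
    shifts = {"444": (0, 0), "422": (1, 0), "420": (1, 1)}
    shift_x, shift_y = shifts[subsampling]
    return (width + shift_x) >> shift_x, (height + shift_y) >> shift_y


def color_samples(width: int, height: int, subsampling: str) -> int:
    """Samples in the color stream: one luma plane plus two chroma planes."""
    chroma_w, chroma_h = chroma_plane_size(width, height, subsampling)
    return width * height + 2 * chroma_w * chroma_h


def alpha_samples(width: int, height: int) -> int:
    """Samples in the alpha stream: a single full-resolution monochrome plane."""
    return width * height


for mode in ("444", "422", "420"):
    print(mode, color_samples(5, 3, mode), alpha_samples(5, 3))
```

The alpha plane always stays at full resolution regardless of how the color stream is subsampled, which is one reason it can be tuned, or kept lossless, on its own.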

3. Container Packaging (ISOBMFF)

Once both bitstreams are compressed, libavif packages them into a HEIF container, which is built on the ISO Base Media File Format (ISOBMFF).

Within the container, the color bitstream is marked as the primary item. The alpha bitstream is stored as an "auxiliary item" and is linked to the primary item by an item reference of type auxl. An auxiliary type property (auxC) attached to the alpha item explicitly identifies it as an alpha channel, telling decoders exactly how to interpret the grayscale data.
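The two pieces of metadata that tie the alpha item to the color item can be sketched at the byte level. The example below is a simplified illustration, not libavif's writer: it serializes only an auxC property and a version 0 iref box holding one auxl reference, and the item IDs 1 (color) and 2 (alpha) are assumed values. A real file also needs the surrounding meta, iprp, and item-location boxes, which are omitted here.

```python
import struct

# URN that marks an auxiliary image as an alpha plane in HEIF/AVIF files.
ALPHA_URN = b"urn:mpeg:mpegB:cicp:systems:auxiliary:alpha"


def box(box_type: bytes, payload: bytes) -> bytes:
    """Basic ISOBMFF box: 32-bit big-endian size, 4-byte type, then payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def full_box(box_type: bytes, version: int, flags: int, payload: bytes) -> bytes:
    """A FullBox adds a 1-byte version and 3-byte flags before the payload."""
    return box(box_type, struct.pack(">I", (version << 24) | flags) + payload)


def auxc_property() -> bytes:
    """'auxC' property: a null-terminated URN naming the auxiliary type."""
    return full_box(b"auxC", 0, 0, ALPHA_URN + b"\x00")


def auxl_reference(alpha_item_id: int, color_item_id: int) -> bytes:
    """'iref' box (version 0, 16-bit IDs) with one 'auxl' reference.

    The reference points from the alpha item to the color item it belongs to.
    """
    reference = box(b"auxl", struct.pack(">HHH", alpha_item_id, 1, color_item_id))
    return full_box(b"iref", 0, 0, reference)


auxc = auxc_property()
iref = auxl_reference(alpha_item_id=2, color_item_id=1)  # assumed item IDs
```

A decoder that finds an item carrying this auxC property, and an auxl reference from it to the primary item, knows to treat the item's decoded luma plane as the alpha channel of the primary image.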

Decoding and Reconstitution

When an AVIF-compatible viewer or web browser opens the image, the decoder reverses this process. It decodes the primary color stream and the auxiliary alpha stream, converts the color data back to RGB if necessary, and then merges the grayscale alpha map with the color channels to produce the final transparent RGBA image for display.
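The final merge can be sketched as the inverse of channel separation. As before, this is an illustrative sketch rather than decoder code: merge_rgba is a hypothetical name, and it assumes the color stream has already been decoded and converted to 8-bit interleaved RGB at the same dimensions as the alpha plane.

```python
def merge_rgba(rgb: bytes, alpha: bytes, width: int, height: int) -> bytes:
    """Interleave decoded RGB samples with a decoded alpha plane into RGBA.

    Hypothetical helper for illustration only; not part of libavif.
    """
    pixel_count = width * height
    if len(rgb) != pixel_count * 3 or len(alpha) != pixel_count:
        raise ValueError("color and alpha planes do not match the image size")

    rgba = bytearray(pixel_count * 4)
    rgba[0::4] = rgb[0::3]  # red samples
    rgba[1::4] = rgb[1::3]  # green samples
    rgba[2::4] = rgb[2::3]  # blue samples
    rgba[3::4] = alpha      # one alpha sample per pixel
    return bytes(rgba)


# Rebuild a 2x1 image: opaque red, then fully transparent blue.
restored = merge_rgba(bytes([255, 0, 0, 0, 0, 255]), bytes([255, 0]), 2, 1)
```

The size check reflects the alignment requirement described above: the two decoded images must cover the same pixel grid before they can be combined.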