Re: [FFmpeg-devel] The patch series about premultiplied alpha

From: Niklas Haas <ffmpeg@haasn.xyz>
To: tc@ffmpeg.org
Cc: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] The patch series about premultiplied alpha
Date: Sun, 3 Aug 2025 12:42:37 +0200
Message-ID: <20250803124237.GB4581@haasn.xyz>
In-Reply-To: <aI5S6y-XCiSYHrVp@phare.normalesup.org>

Hi all,

For context, this is the patch series being discussed:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20031

A description of my position follows.

On Sat, 02 Aug 2025 20:03:23 +0200 Nicolas George <george@nsup.org> wrote:
> Nicolas George (HE12025-08-02):
> > - The patch series lacks transverse user documentation.
> >
> > - The patch series increases the risk of data corruption due to user
> >   negligence and should include at least some guardrails against that.
>
> Here is the situation before this patch series:
>
> - All alpha is supposed to be straight by default.

I would weaken this statement a bit. While I agree that the majority of
files and filters assume straight alpha, the claim that this represents
the *ought* state (and not merely the *is* state) is not supported by the
documentation.

> - A few niche decoders (EXR, JPEG XL) would silently output
>   premultiplied alpha.
>
> - A few niche encoders (the same) would silently assume their input is
>   premultiplied alpha.
>
> - There is a pair of filters premultiply / unpremultiply.
>
> - The overlay filter supports premultiplied alpha if enabled with an
>   undocumented option.
>
> - There is no protection at all: if a command connects input with one
>   kind of alpha with out expecting the other kind at any point, it will
>   produce invalid output without so much as a warning.
>
> - There is barely any documentation at all. The filters are documented
>   for what they do and that is all.

I otherwise agree with this summary of the status quo.

> Documentation is easy: just write a few paragraphs to explain what
> premultiplied alpha is, where it likely to occur (what did motivate the
> writing of this series?) and how to deal with it (links to the
> conversion filters).

I don't think this is the main source of disagreement, so I will be brief.
In a nutshell, I think that it's not generally FFmpeg's job to go beyond a
working explanation of image processing concepts that are already well-
established in the industry. We also don't document in detail how and why
chroma subsampling is used, or how HDR metadata is interpreted, etc.

It's easy enough for any competent engineer to find out this information,
from sources which will surely outlive the FFmpeg project, e.g.:
https://en.wikipedia.org/wiki/Alpha_compositing#Straight_versus_premultiplied

That said, I don't think this is a relevant point of discussion, so as a
compromise I suggest simply expanding the doxygen comments above the newly
added `enum AVAlphaMode` to spell out at least what the practical
difference is and why you would choose one over the other.
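
For readers who want the practical difference in concrete terms, here is
a minimal sketch for 8-bit RGBA. All names in it (`Pixel`, `mul255`,
`premultiply`, `unpremultiply`, `over_premul`) are made up for
illustration and are not FFmpeg API; the rounding choices are likewise
just one reasonable assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit RGBA pixel, for illustration only (not an FFmpeg type). */
typedef struct Pixel { uint8_t r, g, b, a; } Pixel;

/* Rounded x * a / 255: scale a value by an 8-bit alpha. */
static uint8_t mul255(unsigned x, unsigned a)
{
    assert(x <= 255 && a <= 255);
    return (uint8_t)((x * a + 127) / 255);
}

/* Straight -> premultiplied: color channels are scaled by alpha. */
static Pixel premultiply(Pixel p)
{
    Pixel out = { mul255(p.r, p.a), mul255(p.g, p.a), mul255(p.b, p.a), p.a };
    return out;
}

/* Rounded x * 255 / a, clamped to 8 bits; the caller guarantees a != 0. */
static uint8_t unmul(unsigned x, unsigned a)
{
    unsigned v;
    assert(a != 0 && a <= 255);
    v = (x * 255 + a / 2) / a;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Premultiplied -> straight. Color is undefined where alpha is zero,
 * so this sketch returns transparent black there. */
static Pixel unpremultiply(Pixel p)
{
    Pixel out = { 0, 0, 0, 0 };
    if (p.a == 0)
        return out;
    out.r = unmul(p.r, p.a);
    out.g = unmul(p.g, p.a);
    out.b = unmul(p.b, p.a);
    out.a = p.a;
    return out;
}

/* Saturating 8-bit add, guarding against invalid premultiplied input. */
static uint8_t add_sat(unsigned x, unsigned y)
{
    unsigned s = x + y;
    return (uint8_t)(s > 255 ? 255 : s);
}

/* "src over dst" with both operands premultiplied:
 * out = src + dst * (1 - src.a), the same formula for color and alpha. */
static Pixel over_premul(Pixel src, Pixel dst)
{
    unsigned inv = 255u - src.a;
    Pixel out = {
        add_sat(src.r, mul255(dst.r, inv)),
        add_sat(src.g, mul255(dst.g, inv)),
        add_sat(src.b, mul255(dst.b, inv)),
        add_sat(src.a, mul255(dst.a, inv)),
    };
    return out;
}
```

The point of the last function is that blending premultiplied data needs
no per-channel special case and no division, which is one practical
reason to prefer it for compositing; straight alpha, in turn, keeps the
full color precision in nearly transparent pixels.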

> Protection against creating invalid output, now. As I mentioned above,
> full negotiation and automatic conversion, which would be ideal, is too
> much work and therefore too much to ask. Niklas agree on that.
>
> On the other hand, protection, i.e. emitting an error and letting the
> user deal with it, is not much work:
>
> - add a flag to filters that support premultiplied input;
>
> - add a check for that flag in the framework.
>
> In total, maybe two lines in a header, four lines in framework code and
> one line per filter that will not damage the output. Similar work has
> already be done for encoders.

I think that such a flag will not actually accomplish what Nicolas thinks it
would. When I pointed out that we already have such a flag for encoders and
asked for clarification of what problem Nicolas was trying to solve, he stated:

> That handles it for encoders, I suppose. But I do not see anything
> protecting you from stacking images with different kind of alpha or
> sending this kind of frames to a muxer with uncoded frames.

I argue that a simple flag cannot possibly solve this problem in any
meaningful way. The nuance here is that the problem is not one of only
supporting premul vs only supporting straight alpha, but rather the fact that
e.g. vf_vstack wants its two inputs to merely *agree* on their alpha type.

What makes this a nontrivial problem to solve is that the libavfilter code has
no way of knowing that two *unrelated* inputs to a filter are supposed to have
a matching alpha mode (and indeed, e.g. vf_overlay explicitly supports any X/Y
combination of blending transparent images onto other transparent images, thus
making it a counterexample to the general assumption that all inputs should be
matching).
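
To make the gap concrete, here is a toy model of the proposed check. It
is only a sketch under my own assumptions: `ToyFilter`, `supports_premul`,
`flag_check_passes` and `inputs_agree` are hypothetical names, not
anything in libavfilter or in the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is libavfilter API. */
enum ToyAlphaMode { TOY_ALPHA_STRAIGHT, TOY_ALPHA_PREMULTIPLIED };

struct ToyFilter {
    const char *name;
    bool supports_premul;   /* the proposed per-filter flag */
};

/* The proposed framework check: reject premultiplied input on any
 * filter that does not carry the flag. It looks at each input alone. */
static bool flag_check_passes(const struct ToyFilter *f,
                              const enum ToyAlphaMode *inputs, size_t nb_inputs)
{
    size_t i;
    assert(f && inputs);
    for (i = 0; i < nb_inputs; i++)
        if (inputs[i] == TOY_ALPHA_PREMULTIPLIED && !f->supports_premul)
            return false;
    return true;
}

/* What a stacking filter actually needs: all inputs share one mode.
 * This is a relation between inputs, which a per-filter flag cannot express. */
static bool inputs_agree(const enum ToyAlphaMode *inputs, size_t nb_inputs)
{
    size_t i;
    assert(inputs);
    for (i = 1; i < nb_inputs; i++)
        if (inputs[i] != inputs[0])
            return false;
    return true;
}
```

A stacking filter would carry the flag, since it handles either mode
fine, so a straight input stacked with a premultiplied one passes
`flag_check_passes` while failing `inputs_agree`. Hard-coding
`inputs_agree` into the framework is not an option either, because of
the vf_overlay case above.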

The way we solve this normally in libavfilter is by having an AVFilterFormats
list for each property that we want to negotiate; filters can then (during
query_formats) set both inputs to the same list reference, which in turn
tells the common code that they should be resolved to the same alpha mode.
Doing this for the alpha mode amounts to adding full format negotiation, as I
have repeatedly explained, and we both agree that is out of scope.
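
As a rough illustration of the shared-reference idea, here is a heavily
simplified toy model. It is not the libavfilter implementation:
`ToyModeList`, `ToyLink` and `toy_merge_link` are hypothetical, and a
bitmask stands in for a real AVFilterFormats list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a negotiable list: a bitmask of modes still allowed. */
enum { TOY_MODE_STRAIGHT = 1 << 0, TOY_MODE_PREMUL = 1 << 1 };

struct ToyModeList {
    unsigned allowed;
};

struct ToyLink {
    unsigned source_modes;        /* what the upstream side can produce */
    struct ToyModeList *wanted;   /* list set by the downstream filter;
                                     several links may point at the same one */
};

/* Generic "framework" step: narrow the downstream list to what the
 * upstream side can deliver. Returns 0 on success, -1 if no mode is left
 * (a real framework would then insert a conversion or fail). */
static int toy_merge_link(struct ToyLink *link)
{
    assert(link && link->wanted);
    link->wanted->allowed &= link->source_modes;
    return link->wanted->allowed ? 0 : -1;
}
```

If a filter points both of its input links at one `ToyModeList`,
narrowing either link narrows both, so the generic code ends up with a
single mode (or an empty set) without knowing anything about the filter.
If the filter gives each link its own list, as an overlay-style filter
would, the two inputs are resolved independently.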

----

Furthermore, Nicolas has the situation reversed: it is not the case that some
filters "support premultiplied input", and the rest implicitly do not. Rather,
*all* filters support both alpha modes, apart from a few obscure exceptions.
(vf_overlay_qsv and vf_overlay_cuda in particular come to mind, though I
suppose vf_fade also needs to be updated to support fading premultiplied
input frames. That is a trivial fix and I will add it to my PR in a bit.)

If we really do care about adding extra errors for these remaining filters
specifically, I can of course add them, but I argue that this is completely
orthogonal to the two problems Nicolas mentioned in his objection.

Finally, I do not see an issue with putting uncoded premultiplied frames into
muxers. As far as I'm aware, there is no current expectation that uncoded
muxers only support certain types of inputs; there is no precedent for blocking
e.g. HDR frames, or YCgCo frames, or any number of other inputs that are very
unlikely to be handled correctly.

>
> Such limited work would make the feature much less dangerous for users,
> it should be done before the feature reaches the public.
>
> Regards,
>
> --
>   Nicolas George