From: Niklas Haas
To: tc@ffmpeg.org
Cc: FFmpeg development discussions and patches
Date: Sun, 3 Aug 2025 12:42:37 +0200
Message-ID: <20250803124237.GB4581@haasn.xyz>
Subject: Re: [FFmpeg-devel] The patch series about premultiplied alpha

Hi all,

For context, this is the patch series being discussed:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20031

A description of my position follows.

On Sat, 02 Aug 2025 20:03:23 +0200 Nicolas George wrote:
> Nicolas George (HE12025-08-02):
> > - The patch series lacks transverse user documentation.
> >
> > - The patch series increases the risk of data corruption due to user
> >   negligence and should include at least some guardrails against that.
>
> Here is the situation before this patch series:
>
> - All alpha is supposed to be straight by default.

I would weaken this statement a bit. While I agree that the majority of
files and filters assume straight alpha, the claim that this represents the
*ought* state (and not merely the *is* state) is not supported by the
documentation.

> - A few niche decoders (EXR, JPEG XL) would silently output
>   premultiplied alpha.
>
> - A few niche encoders (the same) would silently assume their input is
>   premultiplied alpha.
>
> - There is a pair of filters premultiply / unpremultiply.
>
> - The overlay filter supports premultiplied alpha if enabled with an
>   undocumented option.
>
> - There is no protection at all: if a command connects input with one
>   kind of alpha with out expecting the other kind at any point, it will
>   produce invalid output without so much as a warning.
>
> - There is barely any documentation at all. The filters are documented
>   for what they do and that is all.

I otherwise agree with this summary of the status quo.

> Documentation is easy: just write a few paragraphs to explain what
> premultiplied alpha is, where it likely to occur (what did motivate the
> writing of this series?) and how to deal with it (links to the
> conversion filters).

I don't think this is the main source of disagreement, so I will be brief.

In a nutshell, I think that it's not generally FFmpeg's job to go beyond a
working explanation of image processing concepts that are already
well-established in the industry. We also don't document in detail how and
why chroma subsampling is used, or how HDR metadata is interpreted, etc.
It's easy enough for any competent engineer to find out this information,
from sources which will surely outlive the FFmpeg project, e.g.:

https://en.wikipedia.org/wiki/Alpha_compositing#Straight_versus_premultiplied

That said, I don't think this is a relevant point of discussion, so as a
compromise I suggest just expanding the doxygen comments above the newly
added `enum AVAlphaMode` to elaborate at least on what the practical
difference is and why you would choose one over the other.

> Protection against creating invalid output, now. As I mentioned above,
> full negotiation and automatic conversion, which would be ideal, is too
> much work and therefore too much to ask. Niklas agree on that.
>
> On the other hand, protection, i.e. emitting an error and letting the
> user deal with it, is not much work:
>
> - add a flag to filters that support premultiplied input;
>
> - add a check for that flag in the framework.
>
> In total, maybe two lines in a header, four lines in framework code and
> one line per filter that will not damage the output. Similar work has
> already be done for encoders.

I think that such a flag will not actually accomplish what Nicolas thinks
it would. When I pointed out that we already have such a flag for encoders
and asked for clarification of what problem Nicolas was trying to solve, he
stated:

> That handles it for encoders, I suppose. But I do not see anything
> protecting you from stacking images with different kind of alpha or
> sending this kind of frames to a muxer with uncoded frames.

I argue that a simple flag cannot possibly solve this problem in any
meaningful way. The nuance here is that the problem is not one of only
supporting premul vs only supporting straight alpha, but rather the fact
that vf_vstack (e.g.) wants its two inputs to merely *agree* on their alpha
type.
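
A minimal sketch of this distinction, in C. All types and function names
here are hypothetical stand-ins for illustration; none of this is actual
libavfilter API, and the per-filter flag is modelled only as I understand
the proposal:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not libavfilter API. */
enum AlphaMode { ALPHA_STRAIGHT, ALPHA_PREMULTIPLIED };

struct ToyFilter {
    const char *name;
    bool supports_premultiplied; /* the proposed per-filter flag */
};

/* Framework-side check in the spirit of the proposal: reject premultiplied
 * input on any filter that does not carry the flag. Each input is examined
 * in isolation. */
static bool flag_check_passes(const struct ToyFilter *f,
                              const enum AlphaMode *inputs, size_t nb_inputs)
{
    assert(f && inputs);
    for (size_t i = 0; i < nb_inputs; i++)
        if (inputs[i] == ALPHA_PREMULTIPLIED && !f->supports_premultiplied)
            return false;
    return true;
}

/* What a stacking filter actually needs: every input has the same mode as
 * the first one, whichever mode that is. */
static bool inputs_agree(const enum AlphaMode *inputs, size_t nb_inputs)
{
    for (size_t i = 1; i < nb_inputs; i++)
        if (inputs[i] != inputs[0])
            return false;
    return true;
}
```

For a stacking filter that handles either mode, the flag would have to be
set, so one straight and one premultiplied input pass `flag_check_passes()`
even though `inputs_agree()` reports the mismatch.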

What makes this a nontrivial problem to solve is that the libavfilter code
has no way of knowing that two *unrelated* inputs to a filter are supposed
to have a matching alpha mode (and indeed, e.g. vf_overlay explicitly
supports any X/Y combination of blending transparent images onto other
transparent images, thus making it a counterexample to the general
assumption that all inputs should be matching).

The way we solve this normally in libavfilter is by having an
AVFilterFormats list for each property that we want to negotiate, and then
filters can (during query_formats) set both inputs to the same list
reference, which in turn allows the common code to understand that they
should be resolved to the same alpha mode.

As I have repeatedly explained, this amounts to adding full format
negotiation, which we both agree is out of scope.

----

Furthermore, Nicolas has the situation reversed: it is not the case that
some filters "support premultiplied input", and the rest implicitly do not.
Rather, *all* filters support both alpha modes, with only a few obscure
exceptions. (vf_overlay_qsv and vf_overlay_cuda in particular come to mind,
though I suppose vf_fade also needs to be updated to support fading
premultiplied input frames - that's a trivial fix and I will add it to my
PR in a bit)

If we really do care about adding extra errors for these remaining filters
specifically, I can of course add them, but I argue that this is completely
orthogonal to the two problems Nicolas mentioned in his objection.

Finally, I do not see an issue with putting uncoded premultiplied frames
into muxers. As far as I'm aware, there is no current expectation that
uncoded muxers only support certain types of inputs; there is no precedent
for blocking e.g. HDR frames, or YCgCo frames, or any other number of
very-unlikely-to-be-handled-correctly inputs.
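
To make the earlier point about shared list references concrete, here is a
minimal sketch in C. It models a negotiable property as a set of candidate
modes that links hold by reference. `ModeList`, `ToyLink` and
`restrict_link` are hypothetical names for illustration only; this is not
the real AVFilterFormats machinery, just the idea behind it:

```c
#include <assert.h>
#include <stddef.h>

/* Candidate alpha modes as a bitmask (hypothetical, for illustration). */
enum { MODE_STRAIGHT = 1 << 0, MODE_PREMULTIPLIED = 1 << 1 };

/* A set of still-possible modes. Links refer to it by pointer, so two
 * links that point at the same ModeList are forced to the same result. */
struct ModeList { unsigned mask; };

struct ToyLink { struct ModeList *modes; };

/* Narrow the candidates on a link to what its source can produce.
 * Returns 0 while at least one mode remains, -1 if the set became empty,
 * i.e. negotiation failed. */
static int restrict_link(struct ToyLink *link, unsigned source_mask)
{
    assert(link && link->modes);
    link->modes->mask &= source_mask;
    return link->modes->mask ? 0 : -1;
}
```

A vstack-like filter would point both input links at one `ModeList`, so a
premultiplied-only source on one input and a straight-only source on the
other empty the shared set and negotiation fails. An overlay-like filter
would give each input its own list, and the same combination succeeds.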

>
> Such limited work would make the feature much less dangerous for users,
> it should be done before the feature reaches the public.
>
> Regards,
>
> --
> Nicolas George