* [FFmpeg-devel] [RFC] swscale dithering
From: Niklas Haas @ 2025-03-24 12:43 UTC
To: ffmpeg-devel
Hi all,
As part of my ongoing swscale rewrite, we have both the opportunity and the
need to make a central decision about how to apply rounding and/or dithering.
Some particular cases I want to point out and gather feedback on include:
1. Should we dither and/or accurately round when scaling up full range
content? For example, say you are converting from full-range rgb24 to
rgb30. The correct conversion is (rgb / 255 * 1023), which involves a
rational factor of exactly 341 / 85, or roughly 4.01176. The fact that this
factor is not an integer means that an exact conversion without dithering,
while not strictly speaking *lossy*, necessarily introduces rounding error.
An input value of 200, for example, gives 200 * 1023 / 255 = 802.35294...,
which ought to be accurately dithered down to a 65%/35% mix of 802 and 803.
This is not what current swscale (nor many other pieces of software) does;
instead, they simply calculate the much easier (x << 2) | (x >> 6). This
amounts to chopping off the lowest 6 bits, i.e. truncating down. With a
little extra effort we can at least round correctly by adding on the
((x >> 5) & 1) bit to the result.
This is especially problematic for the alpha channel, as a correct
upconversion of yuva444p to yuva444p10 would otherwise collapse to a simple
left shift by 2 if not for the presence of the alpha channel which would
require a full float conversion, multiplication and dither pass.
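To make the numbers in question 1 concrete, here is a rough Python sketch (not swscale code; the function names are made up for this illustration) that compares the exact full-range 8-bit to 10-bit scaling with the bit-replication shortcut, and computes the ideal dither weights for a given input:

```python
from fractions import Fraction

def exact_8_to_10(x):
    """Exact full-range scaling x * 1023 / 255, kept as a rational number."""
    return Fraction(x * 1023, 255)

def replicate_8_to_10(x):
    """The cheap bit-replication shortcut (x << 2) | (x >> 6)."""
    return (x << 2) | (x >> 6)

def dither_mix(x):
    """Return (low, weight_low, high, weight_high) for an ideal dither.

    An ideal dither outputs `low` and `high` in proportions such that
    the average equals the exact value.
    """
    exact = exact_8_to_10(x)
    low = exact.numerator // exact.denominator  # floor of the exact value
    frac = exact - low                          # fractional part in [0, 1)
    return low, 1 - frac, low + 1, frac

if __name__ == "__main__":
    low, w_low, high, w_high = dither_mix(200)
    print("exact:", float(exact_8_to_10(200)))
    print("dither:", low, "with weight", w_low, "and", high, "with weight", w_high)
    print("shortcut:", replicate_8_to_10(200))
```

For the input 200, the ideal mix is 802 with weight 11/17 and 803 with weight 6/17, while the shortcut always outputs 803. Both methods agree at the endpoints 0 and 255.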
2. At what bit depth does dithering become negligible? For context, the
generally quoted threshold of human visual perception is ~12 bits SDR and
~14 bits HDR. So for something like yuv444p16, we could get away with
outputting the truncated results without dithering nor accurate rounding,
without the risk of human visible error. However, this does increase the
risk of a *compounding* error as more and more conversions are performed.
3. Should we dither per-channel after conversion from grayscale to RGB? For
example, say I am converting gray10 to rgb24. The most performant way to
do this would be to dither the gray channel down to gray8 and then copy it
to all three values (R, G, B) = (Y8). The more accurate way to do it, OTOH,
would be to set (R10, G10, B10) = (Y10) and then dither each channel
independently, with an offset dither mask per channel. This gives greater
precision, which may matter especially when dithering to a very low bit
depth (e.g. rgb8 or rgb4), but makes the conversion roughly 3x more
expensive.
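The two options for gray10 to rgb24 can be sketched as follows. This is only an illustration under stated assumptions: it uses a 2x2 Bayer matrix, full-range scaling by 255 / 1023, and a hypothetical choice of per-channel mask offsets; none of the names come from swscale.

```python
# 2x2 Bayer matrix; entry m maps to the threshold (2 * m + 1) / 8.
BAYER_2X2 = ((0, 2), (3, 1))

# Hypothetical (dx, dy) mask offsets for R, G and B.
CHANNEL_OFFSETS = ((0, 0), (1, 0), (0, 1))

def dither_10_to_8(v, x, y):
    """Ordered dither of a full-range 10-bit value down to 8 bits.

    Computes floor(v * 255 / 1023 + (2 * m + 1) / 8) in exact integer
    arithmetic, where m is the Bayer entry at pixel position (x, y).
    """
    m = BAYER_2X2[y % 2][x % 2]
    return (v * 255 * 8 + (2 * m + 1) * 1023) // (8 * 1023)

def gray10_to_rgb24_fast(v, x, y):
    """Dither once to gray8, then copy the result to all three channels."""
    g = dither_10_to_8(v, x, y)
    return (g, g, g)

def gray10_to_rgb24_per_channel(v, x, y):
    """Dither each channel independently with an offset dither mask."""
    return tuple(dither_10_to_8(v, x + dx, y + dy) for dx, dy in CHANNEL_OFFSETS)

if __name__ == "__main__":
    print(gray10_to_rgb24_fast(516, 0, 0))
    print(gray10_to_rgb24_per_channel(516, 0, 0))
```

For the 10-bit value 516 (exactly 128.62... in 8 bits) at pixel (0, 0), the fast path yields (128, 128, 128), while the per-channel path yields (128, 129, 129), whose channel average is closer to the exact value. The per-channel path performs three dither operations per pixel instead of one.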
4. What should we make of the SWS_ACCURATE_RND and SWS_BITEXACT flags? I am
personally thinking that SWS_BITEXACT should become a no-op flag, with
bit exact output being the default behavior of all new implementations.
But what about SWS_ACCURATE_RND?
I am thinking that SWS_ACCURATE_RND should essentially be the switch that
toggles our preferred resolution of question 1. So in other words, with
SWS_ACCURATE_RND specified, full range upconversions should go through an
accurate dither pass, while being relaxed to the simple (x << 2) | (x >> 6)
upconversion in the absence of this flag.
How should this flag relate to question 2? With the flag specified, I am
thinking that we should also force dithering even at 16 bit depth, and
skip dithering in this case only in the flag's absence. If so, what
bit depth should the cutoff threshold be, for when to skip accurate
dithering? I am thinking to simply use the 12/14 bit SDR/HDR threshold as
appropriate for the content type.
This would lead to the following conversions, as an illustration:
SWS_ACCURATE_RND specified:
- rgb24 -> yuv420p10: full dithering
- rgb24 -> yuv420p12: full dithering
- rgb24 -> rgb30: full dithering
- rgb24 -> rgba64: full dithering
- yuva444p -> yuva444p10: scale YUV, dither alpha
- yuva444p14 -> yuva444p16: scale YUV, dither alpha
- yuv444p10 -> yuv444p14: left shift, no dithering needed
SWS_ACCURATE_RND absent:
- rgb24 -> yuv420p10: full dithering
- rgb24 -> yuv420p12: truncate if SDR, full dithering if HDR
- rgb24 -> rgb30: truncate
- rgb24 -> rgba64: truncate
- yuva444p -> yuva444p10: left shift YUV, truncate alpha
- yuva444p14 -> yuva444p16: left shift YUV, truncate alpha
Does this seem reasonable?
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [RFC] swscale dithering
From: Michael Niedermayer @ 2025-03-24 20:04 UTC
To: FFmpeg development discussions and patches
Hi Niklas
On Mon, Mar 24, 2025 at 01:43:19PM +0100, Niklas Haas wrote:
> Hi all,
>
> As part of my ongoing swscale rewrite, we have both the opportunity and the
> need to make a central decision about how to apply rounding and/or dithering.
>
> Some particular cases I want to point out and gather feedback on include:
>
all IMHO:
> 1. Should we dither and/or accurately round when scaling up full range
> content? For example, say you are converting from full-range rgb24 to
by default, yes
[...]
> 2. At what bit depth does dithering become negligible? For context, the
I think we should consistently always apply it by default
also especially if someone does work with let's say 32bit, that person has
some strange requirements already and direct human vision may not be it.
[...]
> 3. Should we dither per-channel after conversion from grayscale to RGB? For
in general, yes
[...]
>
> 4. What should we make of the SWS_ACCURATE_RND and SWS_BITEXACT flags? I am
> personally thinking that SWS_BITEXACT should become a no-op flag, with
> bit exact output being the default behavior of all new implementations.
> But what about SWS_ACCURATE_RND?
>
> I am thinking that SWS_ACCURATE_RND should essentially be the switch that
> toggles our preferred resolution of question 1. So in other words, with
> SWS_ACCURATE_RND specified, full range upconversions should go through an
> accurate dither pass, while being relaxed to the simple (x << 2) | (x >> 6)
> upconversion in the absence of this flag.
>
> How should this flag relate to question 2? With the flag specified, I am
> thinking that we should also force dithering even at 16 bit depth, and
> skip dithering in this case only in the flag's absence. If so, what
> bit depth should the cutoff threshold be, for when to skip accurate
> dithering? I am thinking to simply use the 12/14 bit SDR/HDR threshold as
> appropriate for the content type.
>
> This would lead to the following conversions, as an illustration:
>
> SWS_ACCURATE_RND specified:
>
> - rgb24 -> yuv420p10: full dithering
> - rgb24 -> yuv420p12: full dithering
> - rgb24 -> rgb30: full dithering
> - rgb24 -> rgba64: full dithering
> - yuva444p -> yuva444p10: scale YUV, dither alpha
> - yuva444p14 -> yuva444p16: scale YUV, dither alpha
> - yuv444p10 -> yuv444p14: left shift, no dithering needed
>
> SWS_ACCURATE_RND absent:
>
> - rgb24 -> yuv420p10: full dithering
> - rgb24 -> yuv420p12: truncate if SDR, full dithering if HDR
> - rgb24 -> rgb30: truncate
> - rgb24 -> rgba64: truncate
> - yuva444p -> yuva444p10: left shift YUV, truncate alpha
> - yuva444p14 -> yuva444p16: left shift YUV, truncate alpha
>
> Does this seem reasonable?
IMHO in the accurate mode, dither should always be on
it's also easier to understand
but there could be a flag for visual perception
SWS_VISSUAL_PERCEPTION (or some better name)
some flag that uses less accurate and faster operations when their
effect is expected to be visually imperceptible
(this may be perceivable when contrast, color, resolution or other
is changed)
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have often repented speaking, but never of holding my tongue.
-- Xenocrates
* Re: [FFmpeg-devel] [RFC] swscale dithering
From: Niklas Haas @ 2025-03-25 1:59 UTC
To: FFmpeg development discussions and patches
On Mon, 24 Mar 2025 21:04:46 +0100 Michael Niedermayer <michael@niedermayer.cc> wrote:
> Hi Niklas
>
> On Mon, Mar 24, 2025 at 01:43:19PM +0100, Niklas Haas wrote:
> > Hi all,
> >
> > As part of my ongoing swscale rewrite, we have both the opportunity and the
> > need to make a central decision about how to apply rounding and/or dithering.
> >
> > Some particular cases I want to point out and gather feedback on include:
> >
>
> all IMHO:
>
>
> > 1. Should we dither and/or accurately round when scaling up full range
> > content? For example, say you are converting from full-range rgb24 to
>
> by default, yes
>
>
> [...]
>
> > 2. At what bit depth does dithering become negligible? For context, the
>
> I think we should consistently always apply it by default
>
> also especially if someone does work with let's say 32bit, that person has
> some strange requirements already and direct human vision may not be it.
>
>
> [...]
>
> > 3. Should we dither per-channel after conversion from grayscale to RGB? For
>
> in general, yes
>
>
> [...]
> >
> > 4. What should we make of the SWS_ACCURATE_RND and SWS_BITEXACT flags? I am
> > personally thinking that SWS_BITEXACT should become a no-op flag, with
> > bit exact output being the default behavior of all new implementations.
> > But what about SWS_ACCURATE_RND?
> >
> > I am thinking that SWS_ACCURATE_RND should essentially be the switch that
> > toggles our preferred resolution of question 1. So in other words, with
> > SWS_ACCURATE_RND specified, full range upconversions should go through an
> > accurate dither pass, while being relaxed to the simple (x << 2) | (x >> 6)
> > upconversion in the absence of this flag.
> >
> > How should this flag relate to question 2? With the flag specified, I am
> > thinking that we should also force dithering even at 16 bit depth, and
> > skip dithering in this case only in the flag's absence. If so, what
> > bit depth should the cutoff threshold be, for when to skip accurate
> > dithering? I am thinking to simply use the 12/14 bit SDR/HDR threshold as
> > appropriate for the content type.
> >
> > This would lead to the following conversions, as an illustration:
> >
> > SWS_ACCURATE_RND specified:
> >
> > - rgb24 -> yuv420p10: full dithering
> > - rgb24 -> yuv420p12: full dithering
> > - rgb24 -> rgb30: full dithering
> > - rgb24 -> rgba64: full dithering
> > - yuva444p -> yuva444p10: scale YUV, dither alpha
> > - yuva444p14 -> yuva444p16: scale YUV, dither alpha
> > - yuv444p10 -> yuv444p14: left shift, no dithering needed
> >
> > SWS_ACCURATE_RND absent:
> >
> > - rgb24 -> yuv420p10: full dithering
> > - rgb24 -> yuv420p12: truncate if SDR, full dithering if HDR
> > - rgb24 -> rgb30: truncate
> > - rgb24 -> rgba64: truncate
> > - yuva444p -> yuva444p10: left shift YUV, truncate alpha
> > - yuva444p14 -> yuva444p16: left shift YUV, truncate alpha
> >
> > Does this seem reasonable?
>
> IMHO in the accurate mode, dither should always be on
> it's also easier to understand
That seems reasonable. It does mean some conversions are necessarily going
to get slower than the status quo.
>
> but there could be a flag for visual perception
>
> SWS_VISSUAL_PERCEPTION (or some better name)
> some flag that uses less accurate and faster operations when their
> effect is expected to be visually imperceptible
This could be the behavior of SWS_DITHER_AUTO, which is not clearly defined
either way.
In a much earlier thread we discussed the idea of adding quality "presets",
which seems like a good thing to consider as well.
How about this proposal?
1. SWS_ACCURATE_RND is added to the default flags.
2. When SWS_ACCURATE_RND is absent, the implementation may truncate instead
of rounding, to enable e.g. fast alpha upconversions.
3. SWS_DITHER_AUTO implies dithering only when it is visually needed,
and skips dithering otherwise.
4. When a specific dither mode is requested, dithering is always performed,
at whatever bit depth.
Tying this into the quality presets, I suggest the "default" quality preset
imply (accurate_rnd + dither=auto). One of the "slower" presets could
always enable dithering, while one of the "faster" presets could skip
accurate rounding.
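As a rough sketch of how the four points above could combine into a single decision, consider the following Python illustration. Everything here is hypothetical: the function name, the string labels for the dither mode, and the use of the 12/14 bit SDR/HDR thresholds from the start of this thread as the definition of "visually needed" are assumptions, not swscale API.

```python
def plan_conversion(accurate_rnd, dither_mode, dst_bits, is_hdr):
    """Return (rounding, dither) for a conversion under the proposal.

    accurate_rnd: whether SWS_ACCURATE_RND is set (proposed default: True)
    dither_mode:  "auto", "none", or any specific mode label such as "bayer"
    dst_bits:     bit depth of the output format
    is_hdr:       whether the content is HDR
    """
    # Assumed perceptual threshold: ~12 bits for SDR, ~14 bits for HDR.
    threshold = 14 if is_hdr else 12

    if dither_mode == "auto":
        # Point 3: dither only when the result is visually needed.
        dither = dst_bits < threshold
    elif dither_mode == "none":
        dither = False
    else:
        # Point 4: a specific dither mode always dithers, at any bit depth.
        dither = True

    # Point 2: without accurate rounding, truncation is permitted.
    rounding = "round" if accurate_rnd else "truncate"
    return rounding, dither

if __name__ == "__main__":
    print(plan_conversion(True, "auto", 10, False))
    print(plan_conversion(True, "auto", 16, False))
    print(plan_conversion(False, "bayer", 16, True))
```

Under these assumptions, the "default" preset (accurate_rnd + dither=auto) would dither a 10-bit SDR output but not a 16-bit one, and an explicitly requested dither mode would dither regardless of depth.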
>
> thx
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> I have often repented speaking, but never of holding my tongue.
> -- Xenocrates