From: Niklas Haas <ffmpeg@haasn.xyz>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [RFC] New swscale internal design prototype
Date: Sun, 9 Mar 2025 22:28:07 +0100
Message-ID: <20250309222807.GB706835@haasn.xyz>
In-Reply-To: <20250309221349.GG683063@haasn.xyz>

On Sun, 09 Mar 2025 22:13:49 +0100 Niklas Haas <ffmpeg@haasn.xyz> wrote:
> The worst slowdowns are currently those involving any sort of packed swizzle
> for which there exist dedicated MMX functions currently:
>
> Conversion pass for bgr24 -> abgr:
> [ u8 XXXX -> +++X] SWS_OP_READ : 3 elem(s) packed >> 0
> [ u8 ...X -> X+++] SWS_OP_SWIZZLE : 0012
> [ u8 X... -> ++++] SWS_OP_CLEAR : {255 _ _ _}
> [ u8 .... -> XXXX] SWS_OP_WRITE : 4 elem(s) packed >> 0
> (X = unused, + = exact, 0 = zero)
> bgr24 1920x1080 -> abgr 1920x1080, flags=0 dither=1, SSIM {Y=0.999997 U=0.999989 V=1.000000 A=1.000000}
> time=1710 us, ref=826 us, speedup=0.483x slower
>
> I have previously identified these as a particularly weak spot in the compiler
> output, since no matter what C code I write, the result will always be roughly
> 0.5x compared to the existing hand-written MMX. That said, I also plan on taking
> that existing MMX code and simply plugging it into the new architecture, which
> should get rid of these last few slow cases.

I also want to point out that many of our conversions are more *accurate*
than the previous implementations. An illustrative example:

Conversion pass for gray -> gray10le:
[ u8 XXXX -> +XXX] SWS_OP_READ : 1 elem(s) packed >> 0
[ u8 .XXX -> +XXX] SWS_OP_CONVERT : u8 -> f32
[f32 .XXX -> .XXX] SWS_OP_SCALE : * 341/85
[f32 .XXX -> .XXX] SWS_OP_DITHER : 16x16 matrix
[f32 .XXX -> .XXX] SWS_OP_CLAMP : 0 <= x <= {1023 _ _ _}
[f32 .XXX -> +XXX] SWS_OP_CONVERT : f32 -> u16
[u16 .XXX -> XXXX] SWS_OP_WRITE : 1 elem(s) packed >> 0
(X = unused, + = exact, 0 = zero)
gray 1920x1080 -> gray10le 1920x1080, flags=0 dither=1, SSIM {Y=0.999974 U=1.000000 V=1.000000 A=1.000000}
time=1317 us, ref=1300 us, speedup=0.987x slower
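
To make the op list above easier to follow, here is a rough Python sketch of
the same sequence of steps for a single pixel. It is not the swscale code.
The names bayer_matrix, DITHER and gray8_to_gray10 are made up for this
illustration, the dither matrix is assumed to be a standard 16x16 Bayer
matrix (the op list only says "16x16 matrix"), the dither is assumed to be
"add an offset in [0, 1), then truncate", and Python doubles stand in for f32.

```python
def bayer_matrix(n):
    """Build an n x n Bayer index matrix (n a power of two), values 0..n*n-1."""
    m = [[0]]
    size = 1
    while size < n:
        # Standard recursion: [[4M, 4M+2], [4M+3, 4M+1]]
        m = [
            [4 * m[y % size][x % size] + ((0, 2), (3, 1))[y // size][x // size]
             for x in range(2 * size)]
            for y in range(2 * size)
        ]
        size *= 2
    return m


# Assumption: a plain Bayer matrix; the real swscale matrix may differ.
DITHER = bayer_matrix(16)


def gray8_to_gray10(v, x, y):
    """Convert one 8-bit gray sample at pixel (x, y) to 10 bits."""
    f = float(v)                                # SWS_OP_CONVERT: u8 -> f32
    f *= 341 / 85                               # SWS_OP_SCALE: 341/85 == 1023/255
    f += (DITHER[y % 16][x % 16] + 0.5) / 256   # SWS_OP_DITHER (assumed form)
    f = min(max(f, 0.0), 1023.0)                # SWS_OP_CLAMP: 0 <= x <= 1023
    return int(f)                               # SWS_OP_CONVERT: f32 -> u16 (truncation assumed)


if __name__ == "__main__":
    # Convert the same input value at every position of one 16x16 tile.
    tile = [gray8_to_gray10(200, x, y) for y in range(16) for x in range(16)]
    print({value: tile.count(value) for value in sorted(set(tile))})
```

Under these assumptions an input of 200 comes out as 803 at 90 of the 256
tile positions (about 35%) and as 802 at the rest, so the tile average stays
close to the exact value of 200 * 1023 / 255.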

The reference implementation handles this as a full-range shift:

    gray10 = gray << 2 | gray >> 6

But this is *not* accurate and will therefore introduce round-trip error. For
example, a value of 200 produces 200 << 2 | 200 >> 6 = 803, while the correct
result would be 200 / 255 * 1023 = 802.3529411764706. Our new implementation
handles this conversion accurately in floating-point math and dithers the
result down to a roughly 65%/35% mix of 802 and 803.
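
The arithmetic for the value 200 can be checked with a short Python sketch.
The function names replicate_bits and exact_scale are invented for this
illustration and do not correspond to anything in swscale. Exact rational
arithmetic is used so that no floating-point rounding gets in the way.

```python
import math
from fractions import Fraction


def replicate_bits(v):
    """Shift-based 8 -> 10 bit expansion: gray << 2 | gray >> 6."""
    return (v << 2) | (v >> 6)


def exact_scale(v):
    """Exact 8 -> 10 bit full-range scaling, v / 255 * 1023, as a fraction."""
    return Fraction(v * 1023, 255)


if __name__ == "__main__":
    v = 200
    exact = exact_scale(v)
    print(replicate_bits(v))   # 803
    print(float(exact))        # about 802.35

    # An unbiased dither between the two neighbouring integers must pick
    # the upper one with probability equal to the fractional part.
    lower = math.floor(exact)
    p_upper = exact - lower
    print(lower, lower + 1, float(p_upper))

    # Largest deviation of the shift-based result from the exact value
    # over all 8-bit inputs.
    worst = max(abs(replicate_bits(i) - exact_scale(i)) for i in range(256))
    print(float(worst))
```

The fractional part for 200 is 6/17, so an unbiased dither should emit 803
about 35% of the time and 802 about 65% of the time.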