From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC] New swscale internal design prototype
Date: Sun, 9 Mar 2025 20:41:39 +0100
Message-ID: <20250309194139.GL4991@pb2>
In-Reply-To: <20250308235342.GB669161@haasn.xyz>
Hi Niklas
On Sat, Mar 08, 2025 at 11:53:42PM +0100, Niklas Haas wrote:
> Hi all,
>
> for the past two months, I have been working on a prototype for a radical
> redesign of the swscale internals, specifically the format handling layer.
> This includes, or will eventually expand to include, all format input/output
> and unscaled special conversion steps.
>
> I am not yet at a point where the new code can replace the scaling kernels,
> but for the time being, we could start using it for the simple unscaled cases,
> in theory, right away.
>
> Rather than repeating my entire design document here, I opted to collect my
> notes into a design document on my WIP branch:
>
> https://github.com/haasn/FFmpeg/blob/swscale3/doc/swscale-v2.txt
>
> I have spent the past week or so ironing out the last kinks and extensively
> benchmarking the new design, at least on x86, and it is roughly a 1.9x
> improvement over the existing unscaled special converters across the board,
> before even adding any hand written ASM. (This speedup is *just* using the
> less-than-optimal compiler output from my reference C code!)
>
> In some cases we even measure ~3-4x or even ~6x speedups, especially those
> where swscale does not currently have hand written SIMD. Overall:
>
> cpu: 16-core AMD Ryzen Threadripper 1950X
> gcc 14.2.1:
> single thread:
> Overall speedup=1.887x faster, min=0.250x max=22.578x
> multi thread:
> Overall speedup=1.657x faster, min=0.190x max=87.972x
>
> (The 0.2x slowdown cases are for rgb8/gbr8 input, which requires LUT support
> for efficient decoding, but I wanted to focus on the core operations first
> before worrying about adding LUT-based optimizations to the design)
>
> I am (almost) ready to begin moving forwards with this design, merging it into
> swscale and using it at least for unscaled format conversions, XYZ decoding,
> colorspace transformations (subsuming the existing, horribly unoptimized,
> 3DLUT layer), gamma transformations, and so on.
>
> I wanted to post it here to gather some feedback on the approach. Where does
> it fall on the "madness" scale? Is the new operations and optimizer design
> comprehensible? Am I trying too hard to reinvent compilers? Are there any
> platforms where the high number of function calls per frame would be
> prohibitively expensive? What are the thoughts on the float-first approach? See
> also the list of limitations and improvement ideas at the bottom of my design
> document.
I think a more float-centric design probably makes sense. Floats make things
nicer and cleaner.
An integer-only path may still be needed for architectures that
have a weak FPU, and also in some cases to keep the output bitexact.
AVFloating, a rational float type, or AVRational64: both are interesting.
Do we have other places where either could be used?
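To make the AVRational64 idea a bit more concrete, here is a minimal sketch of
what a 64-bit rational could look like in C. The names Rational64, r64_make and
r64_mul are hypothetical; they are not taken from the design document or from
libavutil, where the existing AVRational stores num and den as plain int. The
sketch assumes a nonzero denominator and ignores overflow.

```c
#include <stdint.h>

/* Hypothetical 64-bit rational; NOT an existing FFmpeg type. */
typedef struct Rational64 {
    int64_t num;
    int64_t den;
} Rational64;

/* Greatest common divisor of |a| and |b| (Euclid's algorithm). */
static int64_t gcd64(int64_t a, int64_t b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Build a reduced rational with a positive denominator.
 * Assumes den != 0; INT64_MIN inputs are not handled. */
static Rational64 r64_make(int64_t num, int64_t den)
{
    int64_t g = gcd64(num, den);
    if (g > 1) {
        num /= g;
        den /= g;
    }
    if (den < 0) {
        num = -num;
        den = -den;
    }
    return (Rational64){ num, den };
}

/* Exact product. Cross-reducing first keeps the intermediates small,
 * which delays (but does not prevent) 64-bit overflow. */
static Rational64 r64_mul(Rational64 a, Rational64 b)
{
    Rational64 x = r64_make(a.num, b.den);
    Rational64 y = r64_make(b.num, a.den);
    return r64_make(x.num * y.num, x.den * y.den);
}
```

With this, r64_mul(r64_make(2, 3), r64_make(3, 4)) gives exactly 1/2, with no
rounding, which is the property that would matter for bitexact constants.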
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.