Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [RFC] New swscale internal design prototype
@ 2025-03-08 22:53 Niklas Haas
  2025-03-09 16:11 ` Martin Storsjö
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Niklas Haas @ 2025-03-08 22:53 UTC (permalink / raw)
  To: ffmpeg-devel

Hi all,

for the past two months, I have been working on a prototype for a radical
redesign of the swscale internals, specifically the format handling layer.
This includes, or will eventually expand to include, all format input/output
and unscaled special conversion steps.

I am not yet at a point where the new code can replace the scaling kernels,
but, in theory, we could start using it for the simple unscaled cases right
away.

Rather than repeating the entire design here, I opted to collect my notes
into a design document on my WIP branch:

https://github.com/haasn/FFmpeg/blob/swscale3/doc/swscale-v2.txt

I have spent the past week or so ironing out the last kinks and extensively
benchmarking the new design, at least on x86. Overall, it is roughly a 1.9x
improvement over the existing unscaled special converters, before even adding
any hand-written ASM. (This speedup is *just* using the less-than-optimal
compiler output from my reference C code!)

In some cases we measure ~3-4x or even ~6x speedups, especially in those
where swscale does not currently have hand-written SIMD. Overall:

cpu: 16-core AMD Ryzen Threadripper 1950X
gcc 14.2.1:
   single thread:
     Overall speedup=1.887x faster, min=0.250x max=22.578x
   multi thread:
     Overall speedup=1.657x faster, min=0.190x max=87.972x

(The ~0.2x worst cases, i.e. slowdowns, are for rgb8/gbr8 input, which requires
 LUT support for efficient decoding, but I wanted to focus on the core
 operations first before worrying about adding LUT-based optimizations to the
 design.)

I am (almost) ready to begin moving forward with this design, merging it into
swscale and using it at least for unscaled format conversions, XYZ decoding,
colorspace transformations (subsuming the existing, horribly unoptimized,
3DLUT layer), gamma transformations, and so on.

I wanted to post it here to gather some feedback on the approach. Where does
it fall on the "madness" scale? Is the new operations and optimizer design
comprehensible? Am I trying too hard to reinvent compilers? Are there any
platforms where the high number of function calls per frame would be
prohibitively expensive? What are the thoughts on the float-first approach? See
also the list of limitations and improvement ideas at the bottom of my design
document.
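
To make the function-call and float-first questions a bit more concrete, here
is a minimal sketch in C. It is NOT taken from the swscale3 branch: the names
(Op, op_scale, op_offset, run_chain), the block size of 32 and the
function-pointer dispatch are all assumptions made purely for illustration.
It widens 8-bit samples to float, runs a chain of simple operations over one
small block at a time, rounds back to 8 bits, and counts how many operation
calls that took.

```c
#include <stddef.h>
#include <stdint.h>

/* All names and constants here are hypothetical, for illustration only. */
#define BLOCK 32

/* One operation: modifies a block of float samples in place. */
typedef void (*op_fn)(float *px, size_t n, float arg);

typedef struct Op {
    op_fn fn;
    float arg;
} Op;

static void op_scale(float *px, size_t n, float arg)
{
    for (size_t i = 0; i < n; i++)
        px[i] *= arg;
}

static void op_offset(float *px, size_t n, float arg)
{
    for (size_t i = 0; i < n; i++)
        px[i] += arg;
}

/* Run every op in the chain over n 8-bit samples, block by block.
 * Returns the number of operation calls that were made. */
static size_t run_chain(const Op *ops, size_t num_ops,
                        const uint8_t *src, uint8_t *dst, size_t n)
{
    float tmp[BLOCK];
    size_t calls = 0;

    for (size_t pos = 0; pos < n; pos += BLOCK) {
        size_t len = n - pos < BLOCK ? n - pos : BLOCK;

        /* Input: widen 8-bit samples to float. */
        for (size_t i = 0; i < len; i++)
            tmp[i] = (float) src[pos + i];

        /* One indirect call per operation per block. */
        for (size_t k = 0; k < num_ops; k++) {
            ops[k].fn(tmp, len, ops[k].arg);
            calls++;
        }

        /* Output: clamp to [0, 255], then round to nearest. */
        for (size_t i = 0; i < len; i++) {
            float v = tmp[i];
            if (v < 0.0f)
                v = 0.0f;
            if (v > 255.0f)
                v = 255.0f;
            dst[pos + i] = (uint8_t) (v + 0.5f);
        }
    }
    return calls;
}
```

In this sketch the call count is ceil(n / BLOCK) * num_ops, so with these
assumed values a single 1920x1080 plane run through three operations would
cost 64800 * 3 = 194400 indirect calls. Whether that is significant on a
given platform is exactly the open question; the real block size and dispatch
mechanism are described in the design document, not here.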

Thanks for your time,
Niklas

Thread overview: 6+ messages
2025-03-08 22:53 [FFmpeg-devel] [RFC] New swscale internal design prototype Niklas Haas
2025-03-09 16:11 ` Martin Storsjö
2025-03-09 19:45   ` Niklas Haas
2025-03-09 18:18 ` Rémi Denis-Courmont
2025-03-09 19:57   ` Niklas Haas
2025-03-09 19:41 ` Michael Niedermayer
