Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [RFC] New swscale internal design prototype
Date: Sun, 9 Mar 2025 20:41:39 +0100
Message-ID: <20250309194139.GL4991@pb2>
In-Reply-To: <20250308235342.GB669161@haasn.xyz>


Hi Niklas

On Sat, Mar 08, 2025 at 11:53:42PM +0100, Niklas Haas wrote:
> Hi all,
> 
> for the past two months, I have been working on a prototype for a radical
> redesign of the swscale internals, specifically the format handling layer.
> This includes, or will eventually expand to include, all format input/output
> and unscaled special conversion steps.
> 
> I am not yet at a point where the new code can replace the scaling kernels,
> but for the time being, we could start using it for the simple unscaled cases,
> in theory, right away.
> 
> Rather than repeating the entire design here, I opted to collect my notes
> into a design document on my WIP branch:
> 
> https://github.com/haasn/FFmpeg/blob/swscale3/doc/swscale-v2.txt
> 
> I have spent the past week or so ironing out the last kinks and extensively
> benchmarking the new design, at least on x86, and it is roughly a 1.9x
> improvement over the existing unscaled special converters across the board,
> before even adding any hand written ASM. (This speedup is *just* using the
> less-than-optimal compiler output from my reference C code!)
> 
> In some cases we even measure ~3-4x or even ~6x speedups, especially in those
> where swscale does not currently have hand-written SIMD. Overall:
> 
> cpu: 16-core AMD Ryzen Threadripper 1950X
> gcc 14.2.1:
>    single thread:
>      Overall speedup=1.887x faster, min=0.250x max=22.578x
>    multi thread:
>      Overall speedup=1.657x faster, min=0.190x max=87.972x
> 
> (The 0.2x slowdown cases are for rgb8/gbr8 input, which requires LUT support
>  for efficient decoding, but I wanted to focus on the core operations first
>  before worrying about adding LUT-based optimizations to the design)
> 
> I am (almost) ready to begin moving forwards with this design, merging it into
> swscale and using it at least for unscaled format conversions, XYZ decoding,
> colorspace transformations (subsuming the existing, horribly unoptimized,
> 3DLUT layer), gamma transformations, and so on.
> 
> I wanted to post it here to gather some feedback on the approach. Where does
> it fall on the "madness" scale? Is the new operations and optimizer design
> comprehensible? Am I trying too hard to reinvent compilers? Are there any
> platforms where the high number of function calls per frame would be
> prohibitively expensive? What are the thoughts on the float-first approach? See
> also the list of limitations and improvement ideas at the bottom of my design
> document.

I think a more float-centric design probably makes sense. Floats make things
nicer and cleaner.
An integer-only path may still be needed for architectures that have a weak
FPU, and in some cases to keep the output bit-exact.
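
To make the bit-exactness concern concrete, here is a minimal sketch (not
swscale code) of one conversion written both ways. The choice of operation
(expanding limited-range luma 16..235 to full range 0..255), the function
names, and the 14-bit fixed-point scale are all assumptions made for
illustration. The integer path uses only integer multiply, add, and shift, so
it should give the same result on any target; the float path rounds at several
points, and its result may depend on the compiler flags and FPU in use.

```c
#include <stdint.h>

/* Hypothetical helper names for illustration; this is not swscale code. */

/* Float path: expand limited-range luma (16..235) to full range (0..255). */
static int expand_float(int y)
{
    float v = (float)(y - 16) * (255.0f / 219.0f);
    return (int)(v + 0.5f);   /* round to nearest; v is non-negative here */
}

/* Integer path: the same mapping in 14-bit fixed point. */
static int expand_fixed(int y)
{
    /* 19077 = round(255 / 219 * 2^14) */
    const int32_t coeff = 19077;
    return (int)(((y - 16) * coeff + (1 << 13)) >> 14);
}

/* Largest disagreement between the two paths over the valid input range. */
static int max_path_difference(void)
{
    int worst = 0;
    for (int y = 16; y <= 235; y++) {
        int d = expand_float(y) - expand_fixed(y);
        if (d < 0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Both functions assume an input in 16..235. Calling max_path_difference()
reports how far the two paths drift apart; any nonzero value means a float
path could not stand in for the integer one where bit-exact output is
required.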

AVFloating, a rational float type or AVRational64, both interesting.
Do we have other places where either could be used?

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.

