Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 00/17] swscale v2: new framework [RFC]
@ 2025-04-26 17:41 Niklas Haas
  2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 01/17] tests/swscale: improve colorization of speedup Niklas Haas
                   ` (19 more replies)
  0 siblings, 20 replies; 33+ messages in thread
From: Niklas Haas @ 2025-04-26 17:41 UTC (permalink / raw)
  To: ffmpeg-devel

Hi all,

After extensive refactoring and iteration on the design and API, and after
implementing an x86 SIMD backend, I'm happy to present the revised version of
my ongoing swscale rewrite, now with 100% less reliance on compiler
autovectorization.

As before, I recommend (re)reading the design document (doc/swscale-v2.txt,
added in patch 06/17) to understand the motivation, structure and
implementation details of this rewrite. At this point, I expect the major API
and internal organization decisions to remain stable.

I will preface this with some benchmark figures from my (new) AMD Ryzen 9 9950X3D:

All formats:
  - single thread: Overall speedup=2.109x faster, min=0.018x max=40.309x
  - multi thread:  Overall speedup=2.607x faster, min=0.112x max=254.738x

"Common" formats: (referenced >100 times in FFmpeg source code)
  - single thread: Overall speedup=2.797x faster, min=0.408x max=16.514x
  - multi thread:  Overall speedup=2.870x faster, min=0.715x max=21.983x

However, the main goal of this rewrite is not to improve performance, but to
improve the maintainability, extensibility and correctness of the code. Most of
the slowdowns for "common" formats are due to increased correctness (e.g.
accurate rounding and dithering), and not the result of a regression per se.
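
To illustrate why accurate rounding costs more than a shortcut, here is a
minimal sketch of narrowing a 16-bit sample to 8 bits in two ways. It is not
code from this series; the function names are hypothetical and only show the
general trade-off between a plain shift and a round-to-nearest rescale.

```c
#include <stdint.h>

/* Shortcut: drop the low byte. One shift, but not always the nearest value. */
uint8_t narrow_truncate(uint16_t x)
{
    return (uint8_t)(x >> 8);
}

/* Round-to-nearest rescale from [0, 65535] to [0, 255].
 * Needs a widening multiply and a division instead of a single shift. */
uint8_t narrow_round(uint16_t x)
{
    return (uint8_t)(((uint32_t)x * 255u + 32767u) / 65535u);
}
```

The two functions disagree for inputs such as 255 or 0xFF00, where the shift
does not land on the nearest 8-bit value.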

All of the remaining slowdowns (notably, the 0.1x cases) are due to incomplete
coverage by the x86 SIMD backend. This currently affects bit-packed formats in
particular (e.g. rgb8, rgb4). I have also not yet incorporated any AVX-512
code, which some of the existing routines take advantage of.

While I will continue working on this and expanding coverage to all remaining
operations, I felt that now is a good point in time to get some code review
and feedback regardless. I would especially appreciate code review of the x86
SIMD code inside libswscale/x86/ops_*.asm, as this is my first time writing
x86 assembly code.

 doc/APIchanges                |   3 +
 doc/scaler.texi               |   3 +
 doc/swscale-v2.txt            | 344 +++++++++++++++++++++++++++
 libswscale/Makefile           |   9 +
 libswscale/format.c           | 945 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 libswscale/format.h           |  29 ++-
 libswscale/graph.c            | 151 ++++++++----
 libswscale/graph.h            |  37 ++-
 libswscale/ops.c              | 850 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 libswscale/ops.h              | 263 +++++++++++++++++++++
 libswscale/ops_backend.c      | 101 ++++++++
 libswscale/ops_backend.h      | 181 ++++++++++++++
 libswscale/ops_chain.c        | 291 +++++++++++++++++++++++
 libswscale/ops_chain.h        | 108 +++++++++
 libswscale/ops_internal.h     | 103 ++++++++
 libswscale/ops_optimizer.c    | 810 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 libswscale/ops_tmpl_common.c  | 176 ++++++++++++++
 libswscale/ops_tmpl_float.c   | 255 ++++++++++++++++++++
 libswscale/ops_tmpl_int.c     | 609 +++++++++++++++++++++++++++++++++++++++++++++++
 libswscale/options.c          |   1 +
 libswscale/swscale.h          |   7 +
 libswscale/tests/swscale.c    |  11 +-
 libswscale/version.h          |   2 +-
 libswscale/x86/Makefile       |   3 +
 libswscale/x86/ops.c          | 735 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 libswscale/x86/ops_common.asm | 208 ++++++++++++++++
 libswscale/x86/ops_float.asm  | 376 +++++++++++++++++++++++++++++
 libswscale/x86/ops_int.asm    | 882 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/checkasm/Makefile       |   8 +-
 tests/checkasm/checkasm.c     |   4 +-
 tests/checkasm/checkasm.h     |  26 +-
 tests/checkasm/sw_ops.c       | 748 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 32 files changed, 8206 insertions(+), 73 deletions(-)

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* [FFmpeg-devel] [PATCH 01/17] swscale/format: rename legacy format conversion table
@ 2025-05-18 14:59 Niklas Haas
  2025-05-18 14:59 ` [FFmpeg-devel] [PATCH 16/17] swscale/format: add new format decode/encode logic Niklas Haas
  0 siblings, 1 reply; 33+ messages in thread
From: Niklas Haas @ 2025-05-18 14:59 UTC (permalink / raw)
  To: ffmpeg-devel; +Cc: Niklas Haas

From: Niklas Haas <git@haasn.dev>

---
 libswscale/format.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/libswscale/format.c b/libswscale/format.c
index e4c1348b90..b77081dd7a 100644
--- a/libswscale/format.c
+++ b/libswscale/format.c
@@ -24,14 +24,14 @@
 
 #include "format.h"
 
-typedef struct FormatEntry {
+typedef struct LegacyFormatEntry {
     uint8_t is_supported_in         :1;
     uint8_t is_supported_out        :1;
     uint8_t is_supported_endianness :1;
-} FormatEntry;
+} LegacyFormatEntry;
 
 /* Format support table for legacy swscale */
-static const FormatEntry format_entries[] = {
+static const LegacyFormatEntry legacy_format_entries[] = {
     [AV_PIX_FMT_YUV420P]        = { 1, 1 },
     [AV_PIX_FMT_YUYV422]        = { 1, 1 },
     [AV_PIX_FMT_RGB24]          = { 1, 1 },
@@ -262,20 +262,20 @@ static const FormatEntry format_entries[] = {
 
 int sws_isSupportedInput(enum AVPixelFormat pix_fmt)
 {
-    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-           format_entries[pix_fmt].is_supported_in : 0;
+    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+           legacy_format_entries[pix_fmt].is_supported_in : 0;
 }
 
 int sws_isSupportedOutput(enum AVPixelFormat pix_fmt)
 {
-    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-           format_entries[pix_fmt].is_supported_out : 0;
+    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+           legacy_format_entries[pix_fmt].is_supported_out : 0;
 }
 
 int sws_isSupportedEndiannessConversion(enum AVPixelFormat pix_fmt)
 {
-    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(format_entries) ?
-           format_entries[pix_fmt].is_supported_endianness : 0;
+    return (unsigned)pix_fmt < FF_ARRAY_ELEMS(legacy_format_entries) ?
+           legacy_format_entries[pix_fmt].is_supported_endianness : 0;
 }
 
 /**
-- 
2.49.0
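
The three accessors touched by this patch share one lookup pattern: a sparse
table indexed by the pixel format enum, guarded by a single unsigned
comparison. The sketch below reproduces that pattern in isolation. The enum,
macro and table names are hypothetical stand-ins (for AVPixelFormat,
FF_ARRAY_ELEMS and legacy_format_entries), not FFmpeg identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum AVPixelFormat */
enum DemoFormat { DEMO_FMT_NONE = -1, DEMO_FMT_A, DEMO_FMT_B, DEMO_FMT_C, DEMO_FMT_D };

/* Hypothetical stand-in for FF_ARRAY_ELEMS */
#define DEMO_ARRAY_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

typedef struct DemoFormatEntry {
    uint8_t is_supported_in  :1;
    uint8_t is_supported_out :1;
} DemoFormatEntry;

/* Designated initializers: formats not listed (DEMO_FMT_B) are zero-filled,
 * and the array ends at the highest listed index (DEMO_FMT_C). */
static const DemoFormatEntry demo_entries[] = {
    [DEMO_FMT_A] = { 1, 1 },
    [DEMO_FMT_C] = { 1, 0 },
};

/* Casting to unsigned turns negative values into very large ones, so one
 * comparison rejects both negative and too-large indices. */
int demo_is_supported_input(enum DemoFormat fmt)
{
    return (unsigned)fmt < DEMO_ARRAY_ELEMS(demo_entries) ?
           demo_entries[fmt].is_supported_in : 0;
}

int demo_is_supported_output(enum DemoFormat fmt)
{
    return (unsigned)fmt < DEMO_ARRAY_ELEMS(demo_entries) ?
           demo_entries[fmt].is_supported_out : 0;
}
```

With this layout, DEMO_FMT_NONE and DEMO_FMT_D fall outside the table and
report as unsupported without any separate range check.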



end of thread, other threads:[~2025-05-18 15:01 UTC | newest]

Thread overview: 33+ messages
2025-04-26 17:41 [FFmpeg-devel] [PATCH 00/17] swscale v2: new framework [RFC] Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 01/17] tests/swscale: improve colorization of speedup Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 02/17] swscale/graph: expose ff_sws_graph_add_pass Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 03/17] swscale/graph: make noop loop more robust Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 04/17] swscale/graph: move vshift() and shift_img() to shared header Niklas Haas
2025-05-16 15:41   ` Ramiro Polla
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 05/17] swscale/graph: prefer bools to ints Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 06/17] doc: add swscale rewrite design document Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 07/17] swscale: add SWS_EXPERIMENTAL flag Niklas Haas
2025-05-08 11:37   ` Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 08/17] swscale/ops: introduce new low level framework Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 09/17] swscale/ops_chain: add internal abstraction for kernel linking Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 10/17] swscale/ops_backend: add reference backend based on C templates Niklas Haas
2025-05-02 15:06   ` Michael Niedermayer
2025-05-08 12:24     ` Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 11/17] swscale/x86: add SIMD backend Niklas Haas
2025-04-29 13:00   ` Michael Niedermayer
2025-04-30 16:24     ` Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 12/17] tests/checkasm: increase number of runs in between measurements Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 13/17] tests/checkasm: add checkasm_check_float Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 14/17] tests/checkasm: add checkasm tests for swscale ops Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 15/17] swscale/format: rename legacy format conversion table Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 16/17] swscale/format: add new format decode/encode logic Niklas Haas
2025-05-02 14:10   ` Michael Niedermayer
2025-05-02 14:36     ` Niklas Haas
2025-04-26 17:41 ` [FFmpeg-devel] [PATCH 17/17] swscale/graph: allow experimental use of new format handler Niklas Haas
2025-04-26 22:22 ` [FFmpeg-devel] [PATCH 00/17] swscale v2: new framework [RFC] Niklas Haas
2025-05-02 17:51 ` Niklas Haas
2025-05-16 11:09 ` Niklas Haas
2025-05-16 14:32   ` Ramiro Polla
2025-05-16 14:39     ` Niklas Haas
2025-05-16 15:44       ` Ramiro Polla
2025-05-18 14:59 [FFmpeg-devel] [PATCH 01/17] swscale/format: rename legacy format conversion table Niklas Haas
2025-05-18 14:59 ` [FFmpeg-devel] [PATCH 16/17] swscale/format: add new format decode/encode logic Niklas Haas
