* [FFmpeg-devel] [PR] avcodec/bswapdsp: improve performance by remove manually unroll (PR #21427)
@ 2026-01-10 14:34 Zhao Zhili via ffmpeg-devel
From: Zhao Zhili via ffmpeg-devel @ 2026-01-10 14:34 UTC
To: ffmpeg-devel; +Cc: Zhao Zhili
PR #21427 opened by Zhao Zhili (quink)
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21427
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21427.patch
Manually unrolling loops increases code size. This sometimes improves
performance, but more often than not it degrades it. Keep the C version
simple, and add assembly optimizations where they are needed.
                 x86-clang       x86-gcc-arch-native  x86-msvc        m1-clang        rpi5-clang      pi5-gcc-14
--------------------------------------------------------------------------------------------------------------------
bswap_buf_c       57.3 ( 1.00x)   19.4 ( 1.00x)        55.4 ( 1.00x)    0.5 ( 1.00x)  143.5 ( 1.00x)   59.8 ( 1.00x)
bswap_buf_this*   49.0 ( 1.17x)   12.5 ( 1.56x)        17.7 ( 3.13x)    0.3 ( 2.04x)   57.9 ( 2.48x)   73.5 ( 0.81x)
bswap_buf_sse2    28.4 ( 2.02x)   24.3 ( 0.80x)        25.5 ( 2.18x)    -               -               -
bswap_buf_ssse3   24.6 ( 2.32x)   16.0 ( 1.22x)        19.0 ( 2.92x)    -               -               -
bswap_buf_avx2    21.2 ( 2.70x)   11.1 ( 1.74x)        11.2 ( 4.95x)    -               -               -
bswap_buf_c: C implementation before this patch
bswap_buf_this: C implementation after this patch
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
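The parenthesized ratios in the table above appear to be the bswap_buf_c timing divided by the timing of the row in question, so a value below 1.00x marks a regression (as in the pi5-gcc-14 column). A minimal sketch of that arithmetic follows; `speedup` and `approx_equal` are hypothetical helper names introduced only for illustration, and the email does not state the unit of the timings.

```c
#include <assert.h>

/* Hypothetical helper: relative speed of a measurement versus the baseline.
 * Values above 1.0 mean faster than the baseline, below 1.0 mean slower. */
static double speedup(double baseline, double measured)
{
    assert(measured > 0.0);
    return baseline / measured;
}

/* Hypothetical helper: compare two doubles within a tolerance. */
static int approx_equal(double a, double b, double tol)
{
    double d = a - b;
    if (d < 0.0)
        d = -d;
    return d <= tol;
}
```

For example, `speedup(57.3, 49.0)` is about 1.17 (x86-clang) and `speedup(59.8, 73.5)` is about 0.81 (pi5-gcc-14). The m1-clang column does not reproduce from the displayed values (0.5 / 0.3 is about 1.67, not 2.04), presumably because the timings are rounded to one decimal place before printing.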
From e41f8e1ac5984bc9ba7abc21fcf49fc55b666e93 Mon Sep 17 00:00:00 2001
From: Zhao Zhili <zhilizhao@tencent.com>
Date: Sat, 10 Jan 2026 22:07:16 +0800
Subject: [PATCH] avcodec/bswapdsp: improve performance by remove manually unroll
---
libavcodec/bswapdsp.c | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/libavcodec/bswapdsp.c b/libavcodec/bswapdsp.c
index f375ab79ac..266aeca44a 100644
--- a/libavcodec/bswapdsp.c
+++ b/libavcodec/bswapdsp.c
@@ -24,19 +24,7 @@
 static void bswap_buf(uint32_t *dst, const uint32_t *src, int w)
 {
-    int i;
-
-    for (i = 0; i + 8 <= w; i += 8) {
-        dst[i + 0] = av_bswap32(src[i + 0]);
-        dst[i + 1] = av_bswap32(src[i + 1]);
-        dst[i + 2] = av_bswap32(src[i + 2]);
-        dst[i + 3] = av_bswap32(src[i + 3]);
-        dst[i + 4] = av_bswap32(src[i + 4]);
-        dst[i + 5] = av_bswap32(src[i + 5]);
-        dst[i + 6] = av_bswap32(src[i + 6]);
-        dst[i + 7] = av_bswap32(src[i + 7]);
-    }
-    for (; i < w; i++)
+    for (int i = 0; i < w; i++)
         dst[i + 0] = av_bswap32(src[i + 0]);
 }
--
2.49.1
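
As a sanity check on the change above, here is a minimal sketch (not part of the patch) that places the two loop shapes side by side and compares their output for a given length. `bswap32_sketch` is a hypothetical portable stand-in for `av_bswap32`, and `bswap_buf_unrolled`, `bswap_buf_simple` and `bswap_versions_agree` are illustrative names; none of them are FFmpeg symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for av_bswap32(): reverse the four bytes. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Loop shape before the patch: eight elements per iteration, then a tail. */
static void bswap_buf_unrolled(uint32_t *dst, const uint32_t *src, int w)
{
    int i;

    for (i = 0; i + 8 <= w; i += 8) {
        dst[i + 0] = bswap32_sketch(src[i + 0]);
        dst[i + 1] = bswap32_sketch(src[i + 1]);
        dst[i + 2] = bswap32_sketch(src[i + 2]);
        dst[i + 3] = bswap32_sketch(src[i + 3]);
        dst[i + 4] = bswap32_sketch(src[i + 4]);
        dst[i + 5] = bswap32_sketch(src[i + 5]);
        dst[i + 6] = bswap32_sketch(src[i + 6]);
        dst[i + 7] = bswap32_sketch(src[i + 7]);
    }
    for (; i < w; i++)
        dst[i] = bswap32_sketch(src[i]);
}

/* Loop shape after the patch: one element per iteration, leaving any
 * unrolling or vectorization to the compiler. */
static void bswap_buf_simple(uint32_t *dst, const uint32_t *src, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = bswap32_sketch(src[i]);
}

/* Returns 1 if both versions produce identical output for w elements
 * (0 <= w <= 64), 0 otherwise. */
static int bswap_versions_agree(int w)
{
    enum { MAX_W = 64 };
    uint32_t src[MAX_W], a[MAX_W], b[MAX_W];

    assert(w >= 0 && w <= MAX_W);
    for (int i = 0; i < w; i++)
        src[i] = 0x01020304u * (uint32_t)(i + 1);

    bswap_buf_unrolled(a, src, w);
    bswap_buf_simple(b, src, w);

    for (int i = 0; i < w; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Calling `bswap_versions_agree(w)` for lengths that are and are not multiples of 8 exercises both the unrolled body and the tail loop of the old version. This only checks behavior; it says nothing about the timings reported in the description.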