From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
Date: Thu, 29 May 2025 19:26:09 +0200
Message-ID: <GV1P250MB0737D7B919000462960BB9DE8F66A@GV1P250MB0737.EURP250.PROD.OUTLOOK.COM>
In-Reply-To: <20250529160224.GB29660@pb2>

Michael Niedermayer:
> Hi
>
> On Thu, May 29, 2025 at 04:37:16PM +0800, Zhao Zhili wrote:
>>
>>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>>
>>> This patch modifies the FFmpeg build system to remove the explicit disabling
>>> of GCC's auto-vectorization feature.
>>>
>>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>>> through extensive optimizations in loop analysis and SIMD code generation.
>>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>>> (2009) to work around early GCC vectorization instability is no longer
>>> necessary for recent gcc versions.
>>>
>>> Key improvements justifying this change:
>>> 1. Enhanced heuristics for loop vectorization cost models
>>> 2. Mature handling of alignment and memory access patterns
>>> 3. Robust fallback mechanisms for unsupported architectures
>>>
>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>> when built with -O3 optimization level, particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.
>>>
>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>>
>>> Version log:
>>> Only allow GCC versions >= 13 to use auto-vectorization.
>>> Discussion see:
>>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>>
>>> ---
>>>  configure | 1 -
>>>  1 file changed, 1 deletion(-)
>>>
>>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>>> ---
>>>  configure | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 3730b0524c..91e3e107c2 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>>>              disable aligned_stack
>>>      fi
>>>  elif enabled gcc; then
>>> -    check_optflags -fno-tree-vectorize
>>> +    gcc_version=$($cc -dumpversion)
>>> +    major_version=${gcc_version%%.*}
>>> +    if [ $major_version -lt 13 ]; then
>>> +        check_optflags -fno-tree-vectorize
>>> +    fi
>>>      check_cflags -Werror=format-security
>>>      check_cflags -Werror=implicit-function-declaration
>>>      check_cflags -Werror=missing-prototypes
>>> --
>>> 2.43.0
>>>
>>
>> It looks like the patch format is corrupted.
>>
>> I’m OK with the code change. However, the commit message is misleading. As already pointed out
>> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
>> they are available at runtime, unless built and run on particular hardware.
>
> can gcc or clang not build code like our runtime cpudetect?
> I mean, build functions for each major type, detect the CPU once
> and switch accordingly?

How would this "once" work in practice?
If the CPU is supposed to be detected only once, the result needs to be
stored somewhere (in static storage). Even if this is initialized in the
library's .init function (so that we can be sure that it is initialized
when the actual code is run), this would still need a check every time
one of these code snippets is run.

>
> I cannot be the first person thinking of that
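To make the point about the per-call check concrete, here is a minimal sketch in plain C. Everything in it is hypothetical and chosen for illustration only: `demo_detect_cpu_flags`, `DEMO_CPU_FLAG_WIDE` and the `add_bytes_*` functions are not FFmpeg or compiler APIs, and the "wide" variant is ordinary C standing in for a function built for a wider instruction set. The detection result is cached in static storage, so detection itself runs once, yet every call to the dispatching function still has to read the cached flags and branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit for this sketch; not a real FFmpeg flag. */
#define DEMO_CPU_FLAG_WIDE 1

/* Counts how often detection really ran, to show it happens only once. */
static int detect_calls;

/* Hypothetical stand-in for real CPU detection (e.g. CPUID on x86).
 * It always reports the feature so the sketch behaves the same anywhere. */
static int demo_detect_cpu_flags(void)
{
    detect_calls++;
    return DEMO_CPU_FLAG_WIDE;
}

/* Detection result kept in static storage; -1 means "not detected yet".
 * Note: this lazy initialization is not thread-safe as written. */
static int cached_cpu_flags = -1;

static int demo_get_cpu_flags(void)
{
    if (cached_cpu_flags < 0)
        cached_cpu_flags = demo_detect_cpu_flags();
    return cached_cpu_flags;
}

/* Baseline variant. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-in for a variant compiled for a wider instruction set.
 * Here it is plain C processing four bytes per iteration, then the tail. */
static void add_bytes_wide(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = (uint8_t)(dst[i]     + src[i]);
        dst[i + 1] = (uint8_t)(dst[i + 1] + src[i + 1]);
        dst[i + 2] = (uint8_t)(dst[i + 2] + src[i + 2]);
        dst[i + 3] = (uint8_t)(dst[i + 3] + src[i + 3]);
    }
    for (; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Dispatcher: detection is cached, but the flag check and branch
 * are still executed on every single call. */
void add_bytes_checked(uint8_t *dst, const uint8_t *src, size_t n)
{
    if (demo_get_cpu_flags() & DEMO_CPU_FLAG_WIDE)
        add_bytes_wide(dst, src, n);
    else
        add_bytes_c(dst, src, n);
}
```

Calling `add_bytes_checked` twice runs `demo_detect_cpu_flags` only once, but both calls go through the branch in the dispatcher. Whether that branch matters in practice depends on how small the dispatched function is, which this sketch does not attempt to measure.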