From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Jiawei <jiawei@iscas.ac.cn>
Subject: Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
Date: Thu, 12 Jun 2025 12:05:26 +0300 (EEST)
Message-ID: <9cb1503b-dadf-ef24-9548-43c85c42914@martin.st>
In-Reply-To: <tencent_19AD25579317084393EB337CB5C109114E05@qq.com>

On Thu, 29 May 2025, Zhao Zhili wrote:

>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>> through extensive optimizations in loop analysis and SIMD code generation.
>> The explicit -fno-tree-vectorize flag, originally added in commit 973859f
>> (2009) to work around early GCC vectorization instability, is no longer
>> necessary for recent GCC versions.
>>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.
>>
>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> Version log:
>> Only allow GCC versions >= 13 to use auto-vectorization.
>> For the discussion, see:
>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>
>> ---
>> configure | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>> ---
>> configure | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..91e3e107c2 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>> disable aligned_stack
>> fi
>> elif enabled gcc; then
>> - check_optflags -fno-tree-vectorize
>> + gcc_version=$($cc -dumpversion)
>> + major_version=${gcc_version%%.*}
>> + if [ $major_version -lt 13 ]; then
>> + check_optflags -fno-tree-vectorize
>> + fi
>> check_cflags -Werror=format-security
>> check_cflags -Werror=implicit-function-declaration
>> check_cflags -Werror=missing-prototypes
>> --
>> 2.43.0
>>
>
> It looks like the patch format is corrupted.
>
> I’m OK with the code change. However, the commit message is misleading. As already pointed out
> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
> they are available at runtime, unless the code is built and run on one particular piece of hardware.

I'm also OK with the code change in itself, but I would prefer not to
advertise or motivate the change with non-default instruction sets like
AVX, SVE and RVV. (For instruction sets that are part of the base
architecture, like NEON, it can be useful though.)

It would also be good to mention the previous attempt to do the same, which
was made in 2016 in cb8646af24bd8e9627cc5e1c62b049a00fe0b07b and reverted
in fd6dbc53855fbfc9a782095d0ffe11dd3a98905f. The issues noticed at that
point related to the complicated inline x86 CABAC assembly, which nearly
exhausts all available registers. In
182663a58a7a099e02e76da3b0f96d63e5c26a6d (in 2023) this function was made
non-inline, so the register exhaustion should no longer affect other
functions as much. This is essential to mention, as it explains why we
hope this attempt will work better than the last one.
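
As a side note on the version gate in the quoted patch: the sketch below
isolates the same logic (take the major number from a "gcc -dumpversion"
style string, keep -fno-tree-vectorize below GCC 13) as a standalone
function, so it can be tried without running configure. The function name
needs_no_tree_vectorize and the conservative handling of unparsable
version strings are assumptions for illustration only; neither is part of
FFmpeg's configure or of the patch.

```sh
#!/bin/sh
# Hypothetical helper, not part of FFmpeg's configure.
# Returns 0 (true) if -fno-tree-vectorize should be kept for the given
# GCC version string, 1 (false) if auto-vectorization may stay enabled.
needs_no_tree_vectorize() {
    version=$1
    major=${version%%.*}            # "13.2.0" -> "13", "9" -> "9"
    case $major in
        ''|*[!0-9]*) return 0 ;;    # assumption: unparsable -> keep the flag
    esac
    [ "$major" -lt 13 ]
}

for v in 9 12.3.0 13 13.2.0 14.1.1 unknown; do
    if needs_no_tree_vectorize "$v"; then
        echo "$v: keep -fno-tree-vectorize"
    else
        echo "$v: allow auto-vectorization"
    fi
done
```

Stripping everything from the first dot onwards means the check behaves
the same whether the compiler reports only its major version or a full
x.y.z string. Quoting "$major" and rejecting non-numeric values avoids a
test(1) error if the compiler prints something unexpected.
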
// Martin
Thread overview: 10+ messages
2025-05-29 7:03 Jiawei
2025-05-29 8:37 ` Zhao Zhili
2025-05-29 10:20 ` Jiawei
2025-05-29 10:53 ` Zhao Zhili
2025-05-29 13:35 ` Frank Plowman
2025-05-29 16:02 ` Michael Niedermayer
2025-05-29 17:26 ` Andreas Rheinhardt
2025-05-29 19:06 ` Michael Niedermayer
2025-05-30 7:36 ` Rémi Denis-Courmont
2025-06-12 9:05 ` Martin Storsjö [this message]