* [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
@ 2025-05-29 7:03 Jiawei
2025-05-29 8:37 ` Zhao Zhili
0 siblings, 1 reply; 10+ messages in thread
From: Jiawei @ 2025-05-29 7:03 UTC (permalink / raw)
To: ffmpeg-devel
Cc: michael, george, remi, post, quinkblack, Jiawei, martin,
kieran618, andreas.rheinhardt
This patch modifies the FFmpeg build system to stop explicitly disabling
GCC's auto-vectorization feature when building with GCC 13 or newer.
Modern GCC versions have demonstrated stable auto-vectorization capabilities
through extensive optimizations in loop analysis and SIMD code generation.
The explicit -fno-tree-vectorize flag, originally added in commit 973859f
(2009) [1] to work around early GCC vectorization instability, is no longer
necessary for recent GCC versions.
Key improvements justifying this change:
1. Enhanced heuristics for loop vectorization cost models
2. Mature handling of alignment and memory access patterns
3. Robust fallback mechanisms for unsupported architectures
This change allows FFmpeg to benefit from automated SIMD optimizations
when built with -O3 optimization level, particularly improving
performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.
[1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
Version log:
Only allow GCC versions >= 13 to use auto-vectorization.
For the discussion, see:
https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
---
configure | 1 -
1 file changed, 1 deletion(-)
Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
---
configure | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/configure b/configure
index 3730b0524c..91e3e107c2 100755
--- a/configure
+++ b/configure
@@ -7656,7 +7656,11 @@ if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
+    gcc_version=$($cc -dumpversion)
+    major_version=${gcc_version%%.*}
+    if [ $major_version -lt 13 ]; then
+        check_optflags -fno-tree-vectorize
+    fi
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
--
2.43.0
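
The version gate in the hunk above can be sketched as a standalone script. This is only an illustration, not part of the patch: the helper names `gcc_major` and `wants_no_tree_vectorize` are hypothetical, and keeping the flag when the version string cannot be parsed is an assumed policy. `gcc -dumpversion` may print either a bare major number (e.g. `13`) or a full version (e.g. `13.2.0`), so both forms are handled.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from FFmpeg's configure.

# Print the numeric major version of a string such as "13.2.0" or "13".
# Fails (returns 1) if the major component is empty or not a number.
gcc_major() {
    major=${1%%.*}
    case $major in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$major"
}

# Succeeds when -fno-tree-vectorize should still be passed.
# Assumption: an unparsable version is treated conservatively (flag kept).
wants_no_tree_vectorize() {
    major=$(gcc_major "$1") || return 0
    [ "$major" -lt 13 ]
}

for v in 9.4.0 12 13.2.0 14; do
    if wants_no_tree_vectorize "$v"; then
        echo "$v: keep -fno-tree-vectorize"
    else
        echo "$v: allow auto-vectorization"
    fi
done
```

Quoting the variable in the numeric test and rejecting non-numeric input avoids a `[` syntax error if the compiler reports an unexpected version string.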
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 7:03 [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation Jiawei
@ 2025-05-29 8:37 ` Zhao Zhili
2025-05-29 10:20 ` Jiawei
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Zhao Zhili @ 2025-05-29 8:37 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Jiawei
> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>
> This patch modifies the FFmpeg build system to remove the explicit disabling
> of GCC's auto-vectorization feature.
>
> Modern GCC versions have demonstrated stable auto-vectorization capabilities
> through extensive optimizations in loop analysis and SIMD code generation.
> The explicit -fno-tree-vectorize flag originally added in commit 973859f
> (2009) to workaround early GCC vectorization instability is no longer
> necessary for recent gcc versions.
>
> Key improvements justifying this change:
> 1. Enhanced heuristics for loop vectorization cost models
> 2. Mature handling of alignment and memory access patterns
> 3. Robust fallback mechanisms for unsupported architectures
>
> This change allows FFmpeg to benefit from automated SIMD optimizations
> when built with -O3 optimization level, particularly improving
> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>
> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>
> Version log:
> Only allow GCC versions >= 13 to use auto-vectorization.
> Disscussion see:
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>
> ---
> configure | 1 -
> 1 file changed, 1 deletion(-)
>
> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
> ---
> configure | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/configure b/configure
> index 3730b0524c..91e3e107c2 100755
> --- a/configure
> +++ b/configure
> @@ -7656,7 +7656,11 @@ if enabled icc; then
> disable aligned_stack
> fi
> elif enabled gcc; then
> - check_optflags -fno-tree-vectorize
> + gcc_version=$($cc -dumpversion)
> + major_version=${gcc_version%%.*}
> + if [ $major_version -lt 13 ]; then
> + check_optflags -fno-tree-vectorize
> + fi
> check_cflags -Werror=format-security
> check_cflags -Werror=implicit-function-declaration
> check_cflags -Werror=missing-prototypes
> --
> 2.43.0
>
It looks like the patch format is corrupted.
I’m OK with the code change. However, the commit message is misleading. As multiple developers
have already pointed out, this option doesn’t help with AVX, SVE and RVV, because we can’t assume
they are available at runtime unless the binary is built for, and run on, specific hardware.
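
One way to see which SIMD extensions the compiler may actually use at build time is to inspect its predefined macros. The sketch below is an illustration only: the helper `macro_defined` is hypothetical, and the compiler is assumed to be GCC or a compatible driver reachable as `$CC` (defaulting to `gcc`). On a typical x86_64 toolchain without `-march` flags one would expect `__SSE2__` to be reported but not `__AVX__`, which is why the auto-vectorizer cannot emit AVX code in a generic build.

```shell
#!/bin/sh
# Hypothetical helper: reads a "-dM -E" macro dump on stdin and
# succeeds if the named macro is defined in it.
macro_defined() {
    grep -q "^#define $1 "
}

# Assumption: ${CC:-gcc} accepts the GCC-style options -dM -E -x c.
for m in __SSE2__ __AVX__ __AVX2__ __ARM_FEATURE_SVE __riscv_vector; do
    if "${CC:-gcc}" -O3 -dM -E -x c /dev/null 2>/dev/null | macro_defined "$m"; then
        echo "$m: enabled at build time"
    else
        echo "$m: not enabled"
    fi
done
```

Adding a flag such as `-march=native` to the compiler invocation changes the result, but the resulting binary is then only safe to run on hardware that has those extensions.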
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 8:37 ` Zhao Zhili
@ 2025-05-29 10:20 ` Jiawei
2025-05-29 10:53 ` Zhao Zhili
2025-05-29 13:35 ` Frank Plowman
2025-05-29 16:02 ` Michael Niedermayer
2025-06-12 9:05 ` Martin Storsjö
2 siblings, 2 replies; 10+ messages in thread
From: Jiawei @ 2025-05-29 10:20 UTC (permalink / raw)
To: Zhao Zhili, ffmpeg-devel
On 2025/5/29 16:37, Zhao Zhili wrote:
>
>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>> through extensive optimizations in loop analysis and SIMD code generation.
>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>> (2009) to workaround early GCC vectorization instability is no longer
>> necessary for recent gcc versions.
>>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>
>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> Version log:
>> Only allow GCC versions >= 13 to use auto-vectorization.
>> Disscussion see:
>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>
>> ---
>> configure | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>> ---
>> configure | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..91e3e107c2 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>> disable aligned_stack
>> fi
>> elif enabled gcc; then
>> - check_optflags -fno-tree-vectorize
>> + gcc_version=$($cc -dumpversion)
>> + major_version=${gcc_version%%.*}
>> + if [ $major_version -lt 13 ]; then
>> + check_optflags -fno-tree-vectorize
>> + fi
>> check_cflags -Werror=format-security
>> check_cflags -Werror=implicit-function-declaration
>> check_cflags -Werror=missing-prototypes
>> --
>> 2.43.0
>>
> It looks like the patch format is corrupted.
Sorry, I don't know how this happened.
>
> I’m OK with the code change. However, the commit message is misleading. As already pointed out
> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
> they are available at runtime, unless build and run on a particular hardware.
Do you think changing it to 'gcc: Allow `-fno-tree-vectorize` when gcc
version greater than 13.' would be better?
BR,
Jiawei
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 10:20 ` Jiawei
@ 2025-05-29 10:53 ` Zhao Zhili
2025-05-29 13:35 ` Frank Plowman
1 sibling, 0 replies; 10+ messages in thread
From: Zhao Zhili @ 2025-05-29 10:53 UTC (permalink / raw)
To: FFmpeg development discussions and patches
> On May 29, 2025, at 18:20, Jiawei <jiawei@iscas.ac.cn> wrote:
>
>
> On 2025/5/29 16:37, Zhao Zhili wrote:
>>
>>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>>
>>> This patch modifies the FFmpeg build system to remove the explicit disabling
>>> of GCC's auto-vectorization feature.
>>>
>>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>>> through extensive optimizations in loop analysis and SIMD code generation.
>>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>>> (2009) to workaround early GCC vectorization instability is no longer
>>> necessary for recent gcc versions.
>>>
>>> Key improvements justifying this change:
>>> 1. Enhanced heuristics for loop vectorization cost models
>>> 2. Mature handling of alignment and memory access patterns
>>> 3. Robust fallback mechanisms for unsupported architectures
>>>
>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>> when built with -O3 optimization level, particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>>
>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>>
>>> Version log:
>>> Only allow GCC versions >= 13 to use auto-vectorization.
>>> Disscussion see:
>>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>>
>>> ---
>>> configure | 1 -
>>> 1 file changed, 1 deletion(-)
>>>
>>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>>> ---
>>> configure | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 3730b0524c..91e3e107c2 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>>> disable aligned_stack
>>> fi
>>> elif enabled gcc; then
>>> - check_optflags -fno-tree-vectorize
>>> + gcc_version=$($cc -dumpversion)
>>> + major_version=${gcc_version%%.*}
>>> + if [ $major_version -lt 13 ]; then
>>> + check_optflags -fno-tree-vectorize
>>> + fi
>>> check_cflags -Werror=format-security
>>> check_cflags -Werror=implicit-function-declaration
>>> check_cflags -Werror=missing-prototypes
>>> --
>>> 2.43.0
>>>
>> It looks like the patch format is corrupted.
> Sorry, I don't know how this happened.
>>
>> I’m OK with the code change. However, the commit message is misleading. As already pointed out
>> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
>> they are available at runtime, unless build and run on a particular hardware.
>
> Do you think change it into 'gcc: Allow `-fno-tree-vectorize` when gcc version greater than 13.' will be better?
Allow `-fno-tree-vectorize` => Allow vectorization
>
>
> BR,
>
> Jiawei
>
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 10:20 ` Jiawei
2025-05-29 10:53 ` Zhao Zhili
@ 2025-05-29 13:35 ` Frank Plowman
1 sibling, 0 replies; 10+ messages in thread
From: Frank Plowman @ 2025-05-29 13:35 UTC (permalink / raw)
To: ffmpeg-devel
On 29/05/2025 11:20, Jiawei wrote:
>
> On 2025/5/29 16:37, Zhao Zhili wrote:
>>
>>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>>
>>> This patch modifies the FFmpeg build system to remove the explicit disabling
>>> of GCC's auto-vectorization feature.
>>>
>>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>>> through extensive optimizations in loop analysis and SIMD code generation.
>>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>>> (2009) to workaround early GCC vectorization instability is no longer
>>> necessary for recent gcc versions.
>>>
>>> Key improvements justifying this change:
>>> 1. Enhanced heuristics for loop vectorization cost models
>>> 2. Mature handling of alignment and memory access patterns
>>> 3. Robust fallback mechanisms for unsupported architectures
>>>
>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>> when built with -O3 optimization level, particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>>
>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>>
>>> Version log:
>>> Only allow GCC versions >= 13 to use auto-vectorization.
>>> Disscussion see:
>>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>>
>>> ---
>>> configure | 1 -
>>> 1 file changed, 1 deletion(-)
>>>
>>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>>> ---
>>> configure | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 3730b0524c..91e3e107c2 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>>> disable aligned_stack
>>> fi
>>> elif enabled gcc; then
>>> - check_optflags -fno-tree-vectorize
>>> + gcc_version=$($cc -dumpversion)
>>> + major_version=${gcc_version%%.*}
>>> + if [ $major_version -lt 13 ]; then
>>> + check_optflags -fno-tree-vectorize
>>> + fi
>>> check_cflags -Werror=format-security
>>> check_cflags -Werror=implicit-function-declaration
>>> check_cflags -Werror=missing-prototypes
>>> --
>>> 2.43.0
>>>
>> It looks like the patch format is corrupted.
> Sorry, I don't know how this happened.
>>
>> I’m OK with the code change. However, the commit message is misleading. As already pointed out
>> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
>> they are available at runtime, unless build and run on a particular hardware.
>
> Do you think change it into 'gcc: Allow `-fno-tree-vectorize` when gcc
> version greater than 13.' will be better?
>
I don't think it's the header line (gcc: Relaxing auto-vectorization
limitation.) that's being called problematic, though maybe the full stop
could be removed. Rather, it's the body text beneath it: "particularly
improving performance on x86_64 (AVX)..." Note that when using git
send-email, lines below the subject line will also become part of the
commit message. Specifically, all text before the first line that
begins with --- will form the commit message, so if you wish to add extra
text to your email but not to your commit message, you can insert it
after the first ---.
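
The rule described above can be approximated with a short script. This is a simplified sketch under stated assumptions: the helper `commit_message_part` is hypothetical, the author address is a placeholder, and the real parsing done by git when applying a mail has additional rules that are not modelled here.

```shell
#!/bin/sh
# Hypothetical helper: print stdin up to, but not including, the first
# line that begins with "---". This approximates which part of a patch
# email body ends up in the commit message.
commit_message_part() {
    sed '/^---/,$d'
}

commit_message_part <<'EOF'
gcc: Relaxing auto-vectorization limitation

Body text that should end up in the commit.

Signed-off-by: Some Author <author@example.com>
---
Version log: notes placed here stay out of the commit message.

 configure | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
EOF
```

Only the subject, body and Signed-off-by line are printed, so a version log is best placed after the first `---`.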
--
Frank
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 8:37 ` Zhao Zhili
2025-05-29 10:20 ` Jiawei
@ 2025-05-29 16:02 ` Michael Niedermayer
2025-05-29 17:26 ` Andreas Rheinhardt
2025-05-30 7:36 ` Rémi Denis-Courmont
2025-06-12 9:05 ` Martin Storsjö
2 siblings, 2 replies; 10+ messages in thread
From: Michael Niedermayer @ 2025-05-29 16:02 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 7527 bytes --]
Hi
On Thu, May 29, 2025 at 04:37:16PM +0800, Zhao Zhili wrote:
>
>
> > On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
> >
> > This patch modifies the FFmpeg build system to remove the explicit disabling
> > of GCC's auto-vectorization feature.
> >
> > Modern GCC versions have demonstrated stable auto-vectorization capabilities
> > through extensive optimizations in loop analysis and SIMD code generation.
> > The explicit -fno-tree-vectorize flag originally added in commit 973859f
> > (2009) to workaround early GCC vectorization instability is no longer
> > necessary for recent gcc versions.
> >
> > Key improvements justifying this change:
> > 1. Enhanced heuristics for loop vectorization cost models
> > 2. Mature handling of alignment and memory access patterns
> > 3. Robust fallback mechanisms for unsupported architectures
> >
> > This change allows FFmpeg to benefit from automated SIMD optimizations
> > when built with -O3 optimization level, particularly improving
> > performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> >
> > [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> >
> > Version log:
> > Only allow GCC versions >= 13 to use auto-vectorization.
> > For discussion, see:
> > https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
> >
> > ---
> > configure | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
> > ---
> > configure | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/configure b/configure
> > index 3730b0524c..91e3e107c2 100755
> > --- a/configure
> > +++ b/configure
> > @@ -7656,7 +7656,11 @@ if enabled icc; then
> > disable aligned_stack
> > fi
> > elif enabled gcc; then
> > - check_optflags -fno-tree-vectorize
> > + gcc_version=$($cc -dumpversion)
> > + major_version=${gcc_version%%.*}
> > + if [ $major_version -lt 13 ]; then
> > + check_optflags -fno-tree-vectorize
> > + fi
> > check_cflags -Werror=format-security
> > check_cflags -Werror=implicit-function-declaration
> > check_cflags -Werror=missing-prototypes
> > --
> > 2.43.0
> >
>
> It looks like the patch format is corrupted.
>
> I’m OK with the code change. However, the commit message is misleading. As already pointed out
> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
> they are available at runtime, unless the code is built and run on specific hardware.
Can gcc or clang not build code like our runtime cpudetect?
I mean, build functions for each major CPU type, detect the CPU once,
and switch accordingly?
I cannot be the first person to think of that.
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Answering whether a program halts or runs forever is
On a Turing machine, in general impossible (Turing's halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 16:02 ` Michael Niedermayer
@ 2025-05-29 17:26 ` Andreas Rheinhardt
2025-05-29 19:06 ` Michael Niedermayer
2025-05-30 7:36 ` Rémi Denis-Courmont
1 sibling, 1 reply; 10+ messages in thread
From: Andreas Rheinhardt @ 2025-05-29 17:26 UTC (permalink / raw)
To: ffmpeg-devel
Michael Niedermayer:
> Hi
>
> On Thu, May 29, 2025 at 04:37:16PM +0800, Zhao Zhili wrote:
>>
>>
>>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>>
>>> This patch modifies the FFmpeg build system to remove the explicit disabling
>>> of GCC's auto-vectorization feature.
>>>
>>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>>> through extensive optimizations in loop analysis and SIMD code generation.
>>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>>> (2009) to workaround early GCC vectorization instability is no longer
>>> necessary for recent gcc versions.
>>>
>>> Key improvements justifying this change:
>>> 1. Enhanced heuristics for loop vectorization cost models
>>> 2. Mature handling of alignment and memory access patterns
>>> 3. Robust fallback mechanisms for unsupported architectures
>>>
>>> This change allows FFmpeg to benefit from automated SIMD optimizations
>>> when built with -O3 optimization level, particularly improving
>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>>
>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>>
>>> Version log:
>>> Only allow GCC versions >= 13 to use auto-vectorization.
>>> For discussion, see:
>>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>>
>>> ---
>>> configure | 1 -
>>> 1 file changed, 1 deletion(-)
>>>
>>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>>> ---
>>> configure | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 3730b0524c..91e3e107c2 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>>> disable aligned_stack
>>> fi
>>> elif enabled gcc; then
>>> - check_optflags -fno-tree-vectorize
>>> + gcc_version=$($cc -dumpversion)
>>> + major_version=${gcc_version%%.*}
>>> + if [ $major_version -lt 13 ]; then
>>> + check_optflags -fno-tree-vectorize
>>> + fi
>>> check_cflags -Werror=format-security
>>> check_cflags -Werror=implicit-function-declaration
>>> check_cflags -Werror=missing-prototypes
>>> --
>>> 2.43.0
>>>
>>
>> It looks like the patch format is corrupted.
>>
>> I’m OK with the code change. However, the commit message is misleading. As already pointed out
>> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
>> they are available at runtime, unless the code is built and run on specific hardware.
>
> can gcc or clang not build code like our runtime cpudetect ?
> i mean build functions for each major type and detect cpu once
> and switch accordingly ?
How would this "once" work in practice? If the CPU is supposed to be
detected only once, the result needs to be stored somewhere (in static
storage). Even if this is initialized in the library's .init function
(so that we can be sure that it is initialized when the actual code is
run), this would still need a check every time one of these code
snippets is run.
>
> I cannot be the first person thinking of that
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 17:26 ` Andreas Rheinhardt
@ 2025-05-29 19:06 ` Michael Niedermayer
0 siblings, 0 replies; 10+ messages in thread
From: Michael Niedermayer @ 2025-05-29 19:06 UTC (permalink / raw)
To: FFmpeg development discussions and patches
[-- Attachment #1.1: Type: text/plain, Size: 8334 bytes --]
On Thu, May 29, 2025 at 07:26:09PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > Hi
> >
> > On Thu, May 29, 2025 at 04:37:16PM +0800, Zhao Zhili wrote:
> >>
> >>
> >>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
> >>>
> >>> This patch modifies the FFmpeg build system to remove the explicit disabling
> >>> of GCC's auto-vectorization feature.
> >>>
> >>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
> >>> through extensive optimizations in loop analysis and SIMD code generation.
> >>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
> >>> (2009) to workaround early GCC vectorization instability is no longer
> >>> necessary for recent gcc versions.
> >>>
> >>> Key improvements justifying this change:
> >>> 1. Enhanced heuristics for loop vectorization cost models
> >>> 2. Mature handling of alignment and memory access patterns
> >>> 3. Robust fallback mechanisms for unsupported architectures
> >>>
> >>> This change allows FFmpeg to benefit from automated SIMD optimizations
> >>> when built with -O3 optimization level, particularly improving
> >>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> >>>
> >>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> >>>
> >>> Version log:
> >>> Only allow GCC versions >= 13 to use auto-vectorization.
> >>> For discussion, see:
> >>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
> >>>
> >>> ---
> >>> configure | 1 -
> >>> 1 file changed, 1 deletion(-)
> >>>
> >>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
> >>> ---
> >>> configure | 6 +++++-
> >>> 1 file changed, 5 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/configure b/configure
> >>> index 3730b0524c..91e3e107c2 100755
> >>> --- a/configure
> >>> +++ b/configure
> >>> @@ -7656,7 +7656,11 @@ if enabled icc; then
> >>> disable aligned_stack
> >>> fi
> >>> elif enabled gcc; then
> >>> - check_optflags -fno-tree-vectorize
> >>> + gcc_version=$($cc -dumpversion)
> >>> + major_version=${gcc_version%%.*}
> >>> + if [ $major_version -lt 13 ]; then
> >>> + check_optflags -fno-tree-vectorize
> >>> + fi
> >>> check_cflags -Werror=format-security
> >>> check_cflags -Werror=implicit-function-declaration
> >>> check_cflags -Werror=missing-prototypes
> >>> --
> >>> 2.43.0
> >>>
> >>
> >> It looks like the patch format is corrupted.
> >>
> >> I’m OK with the code change. However, the commit message is misleading. As already pointed out
> >> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
> >> they are available at runtime, unless the code is built and run on specific hardware.
> >
> > can gcc or clang not build code like our runtime cpudetect ?
> > i mean build functions for each major type and detect cpu once
> > and switch accordingly ?
>
> How would this "once" work in practice? If the cpu is supposed to be
> detected only once, the result needs to be stored somewhere (in static
> storage). Even if this is initialized in the libraries .init function
> (so that we can be sure that it is initialized when the actual code is
> run), this would still need a check every time one of these code
> snippets is run.
I am not a compiler developer, nor an expert in ELF,
but the CPU detection could edit the PLT or use an approach
similar to how our code works, that is, function pointers.
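
To make the function-pointer idea concrete, here is a minimal sketch in C.
All names (DemoDSPContext, demo_dsp_init, DEMO_CPU_FLAG_SIMD, add_bytes_*)
are hypothetical and are not FFmpeg's actual API, and the "SIMD" variant is
only a stand-in that reuses the C loop. The point is that the CPU flags are
examined once at init time, and later calls go through the pointer without
any further check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel signature: add src to dst, byte by byte. */
typedef void (*add_bytes_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Portable C version, always available. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-in for a variant built for a wider instruction set.
 * In this sketch it simply reuses the C loop. */
static void add_bytes_simd(uint8_t *dst, const uint8_t *src, size_t n)
{
    add_bytes_c(dst, src, n);
}

/* Hypothetical CPU flag bit, standing in for a real detection result. */
#define DEMO_CPU_FLAG_SIMD 1

typedef struct DemoDSPContext {
    add_bytes_fn add_bytes;
} DemoDSPContext;

/* Called once: picks the implementation from the detected flags. */
static void demo_dsp_init(DemoDSPContext *c, int cpu_flags)
{
    assert(c);
    c->add_bytes = add_bytes_c;
    if (cpu_flags & DEMO_CPU_FLAG_SIMD)
        c->add_bytes = add_bytes_simd;
}
```

A caller would run demo_dsp_init() once with the detected flags and then
call c.add_bytes(dst, src, n) in its hot loop.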
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 16:02 ` Michael Niedermayer
2025-05-29 17:26 ` Andreas Rheinhardt
@ 2025-05-30 7:36 ` Rémi Denis-Courmont
1 sibling, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2025-05-30 7:36 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 29 May 2025 19:02:24 GMT+03:00, Michael Niedermayer <michael@niedermayer.cc> wrote:
>can gcc or clang not build code like our runtime cpudetect ?
You can, on some versions and some architectures, select the target CPU per function, but you can't select multiple targets, nor have the compiler automatically select "relevant" targets (i.e. those that it can optimise differently).
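
As an illustration of per-function target selection, the sketch below
assumes GCC or clang on x86, where the `target` function attribute and
`__builtin_cpu_supports()` are available; on other compilers or
architectures it falls back to the generic loop. The function names are
hypothetical. Whether the compiler actually vectorizes the AVX2 copy depends
on the optimization level, and the explicit check on every call is the kind
of per-call cost raised earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline loop, compiled for the default target of the build. */
static void scale_generic(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define DEMO_HAVE_AVX2_TARGET 1
/* Same loop, but the compiler may emit AVX2 code for this function only. */
__attribute__((target("avx2")))
static void scale_avx2(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
#else
#define DEMO_HAVE_AVX2_TARGET 0
#endif

/* Picks a variant at run time; the CPU check happens on every call. */
static void demo_scale(float *dst, const float *src, float k, size_t n)
{
    assert(dst && src);
#if DEMO_HAVE_AVX2_TARGET
    if (__builtin_cpu_supports("avx2")) {
        scale_avx2(dst, src, k, n);
        return;
    }
#endif
    scale_generic(dst, src, k, n);
}
```

Note that the programmer still has to name the target and write the
dispatch by hand here; the compiler does not choose the variants itself.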
>i mean build functions for each major type and detect cpu once
>and switch accordingly ?
GCC supports resolving a symbol at runtime, but it requires support from the run-time linker (of course), which is not portable. I'm not sure if anything other than GNU/libc actually supports it.
And it requires a lot of boilerplate (less than FFmpeg's approach but still).
>I cannot be the first person thinking of that
No, indeed - auto vectorisation has been a thing for twenty years. But, either nobody has cared to fund that runtime detection bit, or compiler developers somehow block that work on some technical basis. My uninformed guess is the former.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation.
2025-05-29 8:37 ` Zhao Zhili
2025-05-29 10:20 ` Jiawei
2025-05-29 16:02 ` Michael Niedermayer
@ 2025-06-12 9:05 ` Martin Storsjö
2 siblings, 0 replies; 10+ messages in thread
From: Martin Storsjö @ 2025-06-12 9:05 UTC (permalink / raw)
To: FFmpeg development discussions and patches; +Cc: Jiawei
On Thu, 29 May 2025, Zhao Zhili wrote:
>> On May 29, 2025, at 15:03, Jiawei <jiawei@iscas.ac.cn> wrote:
>>
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions have demonstrated stable auto-vectorization capabilities
>> through extensive optimizations in loop analysis and SIMD code generation.
>> The explicit -fno-tree-vectorize flag originally added in commit 973859f
>> (2009) to workaround early GCC vectorization instability is no longer
>> necessary for recent gcc versions.
>>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
>>
>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> Version log:
>> Only allow GCC versions >= 13 to use auto-vectorization.
>> For discussion, see:
>> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20250521061750.54882-1-jiawei@iscas.ac.cn/
>>
>> ---
>> configure | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> Signed-off-by: Jiawei <jiawei@iscas.ac.cn>
>> ---
>> configure | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..91e3e107c2 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,11 @@ if enabled icc; then
>> disable aligned_stack
>> fi
>> elif enabled gcc; then
>> - check_optflags -fno-tree-vectorize
>> + gcc_version=$($cc -dumpversion)
>> + major_version=${gcc_version%%.*}
>> + if [ $major_version -lt 13 ]; then
>> + check_optflags -fno-tree-vectorize
>> + fi
>> check_cflags -Werror=format-security
>> check_cflags -Werror=implicit-function-declaration
>> check_cflags -Werror=missing-prototypes
>> --
>> 2.43.0
>>
>
> It looks like the patch format is corrupted.
>
> I’m OK with the code change. However, the commit message is misleading. As already pointed out
> by multiple developers, this option doesn’t help with AVX, SVE and RVV because we can’t assume
> they are available at runtime, unless FFmpeg is built for and run on that particular hardware.
I'm also OK with the code change itself, but I would prefer not to
advertise or motivate the change with non-default instruction sets like
AVX, SVE and RVV. (For instruction sets that are part of the base
architecture, like NEON, it can be useful though.)

It would also be good to mention the previous attempt to do the same,
which was made in 2016 in cb8646af24bd8e9627cc5e1c62b049a00fe0b07b and
reverted in fd6dbc53855fbfc9a782095d0ffe11dd3a98905f. The issues noticed
at that point related to the complicated inline x86 cabac assembly, which
nearly exhausts all available registers. In
182663a58a7a099e02e76da3b0f96d63e5c26a6d (in 2023) this function was made
non-inline, so the register exhaustion should no longer affect other
functions as much. This is essential to mention, as it explains why we
hope this attempt will work better than the last one.

// Martin
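
For readers following the quoted diff, the version gate it adds can be
tried outside of configure. The following is only a minimal sketch in
POSIX sh, not the code from the patch: the function name
needs_no_tree_vectorize and its optional second argument are hypothetical,
and the fallback for an unparsable version string is an assumption added
here (the patch itself has no such check).

```shell
#!/bin/sh
# Hypothetical helper, not part of FFmpeg's configure.
# Usage: needs_no_tree_vectorize VERSION [MIN_MAJOR]
# Returns 0 (true) when -fno-tree-vectorize should still be requested,
# i.e. when the GCC major version is below MIN_MAJOR (default 13).
needs_no_tree_vectorize() {
    version=$1
    min_major=${2:-13}
    # Strip everything from the first dot: "12.3.0" -> "12", "13" -> "13".
    major=${version%%.*}
    case $major in
        # Assumption: if the major version is empty or not purely numeric,
        # stay conservative and keep the flag.
        ''|*[!0-9]*) return 0 ;;
    esac
    [ "$major" -lt "$min_major" ]
}

# Demonstration on a few literal version strings.
for v in 9.4.0 12.3.0 13 14.2.1; do
    if needs_no_tree_vectorize "$v"; then
        echo "$v: keep -fno-tree-vectorize"
    else
        echo "$v: allow auto-vectorization"
    fi
done
```

In configure the argument would come from "$cc -dumpversion", as in the
diff. Some GCC builds print only the major number there, which the
"%%.*" expansion handles because a string without a dot is left unchanged.
Quoting "$major" in the test also avoids a syntax error from [ when the
version string is empty.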
Thread overview: 10+ messages
2025-05-29 7:03 [FFmpeg-devel] [FFmpeg-devel, v2] gcc: Relaxing auto-vectorization limitation Jiawei
2025-05-29 8:37 ` Zhao Zhili
2025-05-29 10:20 ` Jiawei
2025-05-29 10:53 ` Zhao Zhili
2025-05-29 13:35 ` Frank Plowman
2025-05-29 16:02 ` Michael Niedermayer
2025-05-29 17:26 ` Andreas Rheinhardt
2025-05-29 19:06 ` Michael Niedermayer
2025-05-30 7:36 ` Rémi Denis-Courmont
2025-06-12 9:05 ` Martin Storsjö