Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
@ 2025-05-21 10:14 Jiawei
  0 siblings, 0 replies; 28+ messages in thread
From: Jiawei @ 2025-05-21 10:14 UTC
  To: ffmpeg-devel

> > -----Original Message-----
> > From: "Nicolas George" <george@nsup.org>
> > Sent: 2025-05-21 14:52:12 (Wednesday)
> > To: "FFmpeg development discussions and patches" <ffmpeg-devel@ffmpeg.org>
> > Cc:
> > Subject: Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
> >
> > Jiawei (HE12025-05-21):
> > >                      particularly improving
> > > performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> >
> > Benchmark needed.
> >
> > Regards,
> >
> > --
> >   Nicolas George

Hi Nicolas,


Since I am a GCC developer, I am not very familiar with the FFmpeg test
flow. Here is my test process; if anything in it is incorrect, please
point it out:


1. Download the video bbb_sunflower_2160p_30fps_normal.mp4.zip
<https://download.blender.org/demo/movies/BBB/bbb_sunflower_2160p_30fps_normal.mp4.zip>
from https://download.blender.org/demo/movies/BBB/, unzip it, and
transcode the first 60 seconds to 1080p to get the benchmark test video:

```
ffmpeg -i bbb_sunflower_2160p_30fps_normal.mp4 -t 60 -vf "scale=1920:1080" -c:v libx265 -c:a libmp3lame 1080p_hevc_mp3.mp4
```


2. Build two versions of FFmpeg, one with the patch and one without,
using the GCC 13.3 release, and run both on an
Intel(R) Core(TM) Ultra 9 285HX.
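
A minimal sketch of how the two x86 configurations could be generated, assuming the options shown in the `configuration:` lines of the logs below. The helper name `configure_cmd`, the variant labels, and the `/tmp/...` prefixes are illustrative only. The function prints each command rather than running it, and the `patched` variant assumes it is used in a tree where the patch is already applied.

```shell
#!/bin/sh
# Print the configure command for one build variant (dry run only).
# Options are copied from the "configuration:" lines in the logs;
# configure_cmd and the variant names are hypothetical helpers.
configure_cmd() {
    variant=$1   # "patched" or "baseline"
    prefix=$2    # install prefix of your choice
    case $variant in
        patched)  cflags='-O3' ;;
        baseline) cflags='-O3 -fno-tree-vectorize' ;;
        *) echo "unknown variant: $variant" >&2; return 1 ;;
    esac
    printf "./configure --prefix=%s --disable-ffplay --arch=x64 --extra-cflags='%s' --enable-static --target-os=linux\n" \
        "$prefix" "$cflags"
}

# Example prefixes are placeholders, not the paths used in the logs.
configure_cmd patched /tmp/ff-patched
configure_cmd baseline /tmp/ff-baseline
```

Using separate prefixes keeps the two binaries side by side, so both can be benchmarked against the same input file.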


Using patch:

```
./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg 
developers
   built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
   configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 
--extra-cflags=-O3 --enable-static --target-os=linux
   libavutil      60.  2.100 / 60.  2.100
   libavcodec     62.  3.101 / 62.  3.101
   libavformat    62.  0.102 / 62.  0.102
   libavdevice    62.  0.100 / 62.  0.100
   libavfilter    11.  0.100 / 11.  0.100
   libswscale      9.  0.100 /  9.  0.100
   libswresample   6.  0.100 /  6.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 
'/home/pz9115/mp/1080p_hevc_mp3.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     encoder         : Lavf60.16.100
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     genre           : Animation
   Duration: 00:01:00.00, start: 0.000000, bitrate: 1564 kb/s
   Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), 
yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 
fps, 30 tbr, 15360 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 libx265
   Stream #0:1[0x2](und): Audio: mp3 (mp3float) (mp4a / 0x6134706D), 
48000 Hz, stereo, fltp, 128 kb/s (default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
Stream mapping:
   Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
   Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     genre           : Animation
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     encoder         : Lavf62.0.102
   Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 
1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
     Metadata:
       encoder         : Lavc62.3.101 wrapped_avframe
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
   Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s 
(default)
     Metadata:
       encoder         : Lavc62.3.101 pcm_s16le
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
[out#0/null @ 0x565233669eb0] video:731KiB audio:11250KiB subtitle:0KiB 
other streams:0KiB global headers:0KiB muxing overhead: unknown
frame= 1800 fps=635 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A 
speed=21.2x elapsed=0:00:02.83
bench: utime=11.324s stime=0.290s rtime=2.834s
bench: maxrss=186556KiB
```

Without patch (here I added -fno-tree-vectorize directly):

./ffmpeg -benchmark -i ~/mp/1080p_hevc_mp3.mp4 -f null -
ffmpeg version N-119636-g96518c8d8d Copyright (c) 2000-2025 the FFmpeg 
developers
   built with gcc 13 (Ubuntu 13.3.0-6ubuntu2~24.04)
   configuration: --prefix=/home/pz9115/ffpo --disable-ffplay --arch=x64 
--extra-cflags='-O3 -fno-tree-vectorize' --enable-static --target-os=linux
   libavutil      60.  2.100 / 60.  2.100
   libavcodec     62.  3.101 / 62.  3.101
   libavformat    62.  0.102 / 62.  0.102
   libavdevice    62.  0.100 / 62.  0.100
   libavfilter    11.  0.100 / 11.  0.100
   libswscale      9.  0.100 /  9.  0.100
   libswresample   6.  0.100 /  6.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 
'/home/pz9115/mp/1080p_hevc_mp3.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     encoder         : Lavf60.16.100
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     genre           : Animation
   Duration: 00:01:00.00, start: 0.000000, bitrate: 1564 kb/s
   Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), 
yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 
fps, 30 tbr, 15360 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 libx265
   Stream #0:1[0x2](und): Audio: mp3 (mp3float) (mp4a / 0x6134706D), 
48000 Hz, stereo, fltp, 128 kb/s (default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
Stream mapping:
   Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
   Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     genre           : Animation
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     encoder         : Lavf62.0.102
   Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 
1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
     Metadata:
       encoder         : Lavc62.3.101 wrapped_avframe
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
   Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s 
(default)
     Metadata:
       encoder         : Lavc62.3.101 pcm_s16le
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
[out#0/null @ 0x55eb196b7eb0] video:731KiB audio:11250KiB subtitle:0KiB 
other streams:0KiB global headers:0KiB muxing overhead: unknown
frame= 1800 fps=509 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A 
speed=  17x elapsed=0:00:03.53
bench: utime=21.544s stime=0.349s rtime=3.536s
bench: maxrss=181580KiB
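
As a rough check, the two x86 runs above can be compared by dividing the baseline times by the patched times. This is only a sketch: the helper name `ratio` is hypothetical, and the numbers are copied from the `bench:` lines of the two logs.

```shell
#!/bin/sh
# ratio BASELINE PATCHED: print baseline/patched to two decimals.
# A value above 1 means the patched build took less time.
# ratio is a hypothetical helper name.
ratio() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

# Values copied from the "bench:" lines of the two x86 runs.
echo "wall-clock (rtime) ratio: $(ratio 3.536 2.834)"
echo "CPU time (utime) ratio:   $(ratio 21.544 11.324)"
```

With these inputs the script reports 1.25 for rtime and 1.90 for utime. Each figure comes from a single run per build, so it is an indication rather than a stable measurement.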

I also tested on a RISC-V development board, the MUSE Pi Pro. The
configuration and results follow:

Using patch:

root@spacemit-k1-x-MUSE-Pi-Pro-board:~# ./ffpv/bin/ffmpeg -benchmark -i 
1080p_hevc_mp3.mp4 -f null -
ffmpeg version n6.1.2 Copyright (c) 2000-2024 the FFmpeg developers
   built with gcc 16.0.0 (g3fc902e738b) 20250519 (experimental)
   configuration: --prefix=/home/pz9115/ffpv --disable-ffplay 
--arch=riscv --extra-cflags='-march=rv64gcv_zba_zbb_zbs -O3 -ffast-math' 
--cross-prefix=/home/pz9115/rvv/bin/riscv64-unknown-linux-gnu- 
--cc=/home/pz9115/rvv/bin/riscv64-unknown-linux-gnu-gcc 
--cxx=/home/pz9115/rvv/bin/riscv64-unknown-linux-gnu-g++ --enable-static 
--enable-cross-compile --target-os=linux --disable-rvv
   libavutil      58. 29.100 / 58. 29.100
   libavcodec     60. 31.102 / 60. 31.102
   libavformat    60. 16.100 / 60. 16.100
   libavdevice    60.  3.100 / 60.  3.100
   libavfilter     9. 12.100 /  9. 12.100
   libswscale      7.  5.100 /  7.  5.100
   libswresample   4. 12.100 /  4. 12.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1080p_hevc_mp3.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     encoder         : Lavf60.16.100
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     genre           : Animation
   Duration: 00:01:00.00, start: 0.000000, bitrate: 1564 kb/s
   Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), 
yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 
fps, 30 tbr, 15360 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 libx265
   Stream #0:1[0x2](und): Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, 
stereo, fltp, 128 kb/s (default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
Stream mapping:
   Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
   Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Odd rotation angle.
If you want to help, upload a sample of this file to 
https://streams.videolan.org/upload/ and contact the ffmpeg-devel 
mailing list. (ffmpeg-devel@ffmpeg.org)Output #0, null, to 'pipe:':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     genre           : Animation
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     encoder         : Lavf60.16.100
   Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 
1080x1920 [SAR 1:1 DAR 9:16], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 wrapped_avframe
   Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s 
(default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 pcm_s16le
[out#0/null @ 0x28a82e0] video:844kB audio:11250kB subtitle:0kB other 
streams:0kB global headers:0kB muxing overhead: unknown
frame= 1800 fps= 42 q=-0.0 Lsize=N/A time=00:00:59.97 bitrate=N/A 
speed=1.41x
bench: utime=207.150s stime=5.319s rtime=42.608s
bench: maxrss=162160kB

Without patch (again with -fno-tree-vectorize added directly):

./ffp/bin/ffmpeg -benchmark -i 1080p_hevc_mp3.mp4 -f null -
ffmpeg version n6.1.2 Copyright (c) 2000-2024 the FFmpeg developers
   built with gcc 16.0.0 (g38163c874a3-dirty) 20250515 (experimental)
   configuration: --prefix=/home/pz9115/ffp --disable-ffplay 
--arch=riscv --sysroot=/home/pz9115/rv/sysroot 
--extra-cflags='-march=rv64gcv_zba_zbb_zbc_zbs_zca_zcd -mabi=lp64d -O3 
-fno-tree-vectorize -static' --extra-ldflags=-static 
--cross-prefix=/home/pz9115/rv/bin/riscv64-unknown-linux-gnu- 
--enable-static --enable-cross-compile --target-os=linux --disable-rvv
   libavutil      58. 29.100 / 58. 29.100
   libavcodec     60. 31.102 / 60. 31.102
   libavformat    60. 16.100 / 60. 16.100
   libavdevice    60.  3.100 / 60.  3.100
   libavfilter     9. 12.100 /  9. 12.100
   libswscale      7.  5.100 /  7.  5.100
   libswresample   4. 12.100 /  4. 12.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '1080p_hevc_mp3.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     encoder         : Lavf60.16.100
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     genre           : Animation
   Duration: 00:01:00.00, start: 0.000000, bitrate: 1564 kb/s
   Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), 
yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 1429 kb/s, 30 
fps, 30 tbr, 15360 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 libx265
   Stream #0:1[0x2](und): Audio: mp3 (mp4a / 0x6134706D), 48000 Hz, 
stereo, fltp, 128 kb/s (default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
Stream mapping:
   Stream #0:0 -> #0:0 (hevc (native) -> wrapped_avframe (native))
   Stream #0:1 -> #0:1 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
   Metadata:
     major_brand     : isom
     minor_version   : 512
     compatible_brands: isomiso2mp41
     title           : Big Buck Bunny, Sunflower version
     artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
     composer        : Sacha Goedegebure
     genre           : Animation
     comment         : Creative Commons Attribution 3.0 - 
http://bbb3d.renderfarming.net
     encoder         : Lavf60.16.100
   Stream #0:0(und): Video: wrapped_avframe, yuv420p(tv, progressive), 
1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
     Metadata:
       handler_name    : GPAC ISO Video Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 wrapped_avframe
   Stream #0:1(und): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s 
(default)
     Metadata:
       handler_name    : GPAC ISO Audio Handler
       vendor_id       : [0][0][0][0]
       encoder         : Lavc60.31.102 pcm_s16le
[out#0/null @ 0x2729630] video:844kB audio:11250kB subtitle:0kB other 
streams:0kB global headers:0kB muxing overhead: unknown
frame= 1800 fps= 30 q=-0.0 Lsize=N/A time=00:00:59.97 bitrate=N/A 
speed=   1x
bench: utime=321.145s stime=2.475s rtime=59.960s
bench: maxrss=131532kB
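
The timing fields can also be pulled out of saved `-benchmark` output instead of being copied by hand. A minimal sketch, assuming the `bench:` line format shown in the logs above; `bench_field` is a hypothetical helper name. Note that the two RISC-V configuration lines above differ in compiler snapshot, `-march` string and `-ffast-math`, so the resulting ratio reflects more than the vectorization flag alone.

```shell
#!/bin/sh
# bench_field NAME: read ffmpeg -benchmark output on stdin and print the
# value of NAME (utime, stime or rtime) from the "bench:" timing line,
# without the trailing "s". bench_field is a hypothetical helper name.
bench_field() {
    sed -n "s/^bench:.*[[:space:]]$1=\([0-9.]*\)s.*/\1/p"
}

# Timing lines copied from the two RISC-V runs.
patched=$(printf '%s\n' 'bench: utime=207.150s stime=5.319s rtime=42.608s' | bench_field rtime)
baseline=$(printf '%s\n' 'bench: utime=321.145s stime=2.475s rtime=59.960s' | bench_field rtime)

# Baseline wall-clock time divided by patched wall-clock time.
awk -v b="$baseline" -v p="$patched" 'BEGIN { printf "rtime ratio: %.2f\n", b / p }'
```

For the two lines above this prints `rtime ratio: 1.41`, matching the reported speed of 1.41x against 1x.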


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
@ 2025-05-21  6:17 Jiawei
  2025-05-21  6:52 ` Nicolas George
                   ` (4 more replies)
  0 siblings, 5 replies; 28+ messages in thread
From: Jiawei @ 2025-05-21  6:17 UTC
  To: ffmpeg-devel; +Cc: Jiawei

This patch modifies the FFmpeg build system to remove the explicit disabling
of GCC's auto-vectorization feature.

Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
capabilities through extensive optimizations in loop analysis and SIMD
code generation. The explicit -fno-tree-vectorize flag, originally added
in commit 973859f (2009) [1] to work around early GCC vectorization
instability, is no longer necessary.

Key improvements justifying this change:
1. Enhanced heuristics for loop vectorization cost models
2. Mature handling of alignment and memory access patterns
3. Robust fallback mechanisms for unsupported architectures

This change allows FFmpeg to benefit from automated SIMD optimizations
when built with -O3 optimization level, particularly improving
performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.

[1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191

---
 configure | 1 -
 1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index 3730b0524c..b9e95ce4ec 100755
--- a/configure
+++ b/configure
@@ -7656,7 +7656,6 @@ if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
-- 
2.43.0
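
To see whether a given checkout still carries the line removed by the diff above, a small check along these lines may help. It is only a sketch: `has_novec_check` is a hypothetical helper name, and the script assumes it is run from the top of an FFmpeg source tree.

```shell
#!/bin/sh
# has_novec_check FILE: succeed if FILE still contains the line that the
# diff removes. has_novec_check is a hypothetical helper name.
has_novec_check() {
    grep -q -e 'check_optflags -fno-tree-vectorize' "$1"
}

if [ -f configure ] && has_novec_check configure; then
    echo "configure still disables GCC auto-vectorization"
else
    echo "no -fno-tree-vectorize check found (or no configure here)"
fi
```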



end of thread, other threads:[~2025-06-04 11:13 UTC | newest]

Thread overview: 28+ messages
2025-05-21 10:14 [FFmpeg-devel] gcc: Remove auto-vectorization limitation Jiawei
  -- strict thread matches above, loose matches on Subject: below --
2025-05-21 10:08 Jiawei
2025-05-21  6:17 Jiawei
2025-05-21  6:52 ` Nicolas George
2025-05-21 10:17   ` Jiawei
2025-05-21 18:21     ` Frank Plowman
2025-05-22  6:32       ` Jiawei
2025-05-24  1:46         ` Kieran Kunhya via ffmpeg-devel
2025-05-24  4:10           ` Jiawei
2025-05-24 16:10         ` Rémi Denis-Courmont
2025-05-25 21:37           ` Michael Niedermayer
2025-05-26  8:43             ` Rémi Denis-Courmont
2025-05-30  0:46               ` Michael Niedermayer
2025-05-30  6:58                 ` Rémi Denis-Courmont
2025-05-31 13:39                   ` Michael Niedermayer
2025-06-03 16:14                   ` Niklas Haas
2025-06-04 11:13                     ` Rémi Denis-Courmont
2025-05-21  7:46 ` Michael Niedermayer
2025-05-21 10:32   ` Jiawei
2025-05-21 11:09     ` Michael Niedermayer
2025-05-21  9:04 ` Zhao Zhili
2025-05-21 10:26   ` Jiawei
2025-05-21 10:33 ` Andreas Rheinhardt
2025-05-21 12:09   ` Martin Storsjö
2025-05-21 12:14     ` Andreas Rheinhardt
2025-05-21 12:22       ` Martin Storsjö
2025-05-21 18:12         ` softworkz .
2025-05-24 12:00 ` Rémi Denis-Courmont
