From: Zhao Zhili <quinkblack@foxmail.com>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 1/2] avfilter/vf_blackdetect: add AVX2 SIMD version
Date: Fri, 18 Jul 2025 20:26:19 +0800
Message-ID: <tencent_29E1D45A12340980F435A93EC556A59D8E09@qq.com>
In-Reply-To: <CABPLASSRQAWfrBNBXo3R6HzZkvq++HQ7b9DQnGbQFVnzOSgPZQ@mail.gmail.com>

> On Jul 18, 2025, at 19:36, Kacper Michajlow <kasper93@gmail.com> wrote:
>
> On Thu, 17 Jul 2025 at 12:45, Niklas Haas <ffmpeg@haasn.xyz> wrote:
>>
>> From: Niklas Haas <git@haasn.dev>
>>
>> Requested by a user. Even with autovectorization enabled, the compiler
>> performs a quite poor job of optimizing this function, due to not being
>> able to take advantage of the pmaxub + pcmpeqb trick for counting the number
>> of pixels less than or equal-to a threshold.
>>
>> blackdetect8_c:       4625.0 ( 1.00x)
>> blackdetect8_avx2:     155.1 (29.83x)
>> blackdetect16_c:      2529.4 ( 1.00x)
>> blackdetect16_avx2:    163.6 (15.46x)
>
> I think we should try to have better standards for reporting
> performance metrics. Those numbers without context mean not so much.
> What compiler, flags, cpu were used? Sure, we can omit some
> information if we want to show only the scaling, but if it highly
> depends on those things, then we should at least try to be more
> specific.
>
> Sorry for being pedantic about those things, but I think it's
> important, especially if we put those values in a commit message which
> will live forever in the repository as a vague reference.

The basic benchmark context can be specified in the documentation and
left out of commit messages, and it is easy to find the implicit
conditions on the mailing list. I'm more worried about the numbers being
posted on their own on social media platforms, e.g., X.

The checkasm benchmark serves our purpose very well, but it is not a fair
benchmark for comparing hand-written assembly against compiler
optimizations. Sure, we can do better than the compiler, but the compiler
isn't that bad.
More importantly, the compiler isn't our enemy, and there is no point in
embarrassing the compiler or its developers.

"If I have seen further it is by standing on the shoulders of Giants."
-- Isaac Newton

Runtime CPU detection is a thing. There are ways to use compiler
auto-vectorization and runtime CPU detection at the same time. Our
approach isn't the only one that works.

>
>> Even with autovectorization enabled
>
> You mention the auto vectorization enabled, yet the reported numbers
> are without it. In my mind this description implies that shown
> performance comparison is with auto vectorization enabled.
>
> When we compare apples to apples, with avx2 we get a more expectable
> 3.74x (gcc) / 2.38x (clang) depending on the compiler. It's still a
> good improvement, no reason to oversell it.
>
> For reference some metrics on me end:
>
> clang 20.1.7
>
> march=generic (default config):
>
> blackdetect8_c:       1591.1 ( 1.00x)
> blackdetect8_avx2:     225.1 ( 7.07x)
> blackdetect16_c:       643.5 ( 1.00x)
> blackdetect16_avx2:    220.6 ( 2.92x)
>
> march=core-avx2:
>
> blackdetect8_c:        526.0 ( 1.00x)
> blackdetect8_avx2:     220.9 ( 2.38x)
> blackdetect16_c:       318.8 ( 1.00x)
> blackdetect16_avx2:    225.9 ( 1.41x)
>
> gcc 14.2.0
>
> -fno-tree-vectorize (default config):
>
> blackdetect8_c:       5126.6 ( 1.00x)
> blackdetect8_avx2:     198.0 (25.89x)
> blackdetect16_c:      2151.9 ( 1.00x)
> blackdetect16_avx2:    196.8 (10.93x)
>
> march=generic -ftree-vectorize:
>
> blackdetect8_c:       1354.4 ( 1.00x)
> blackdetect8_avx2:     196.9 ( 6.88x)
> blackdetect16_c:       644.2 ( 1.00x)
> blackdetect16_avx2:    249.8 ( 2.58x)
>
> march=core-avx2 -ftree-vectorize:
>
> blackdetect8_c:        820.8 ( 1.00x)
> blackdetect8_avx2:     219.2 ( 3.74x)
> blackdetect16_c:       372.8 ( 1.00x)
> blackdetect16_avx2:    201.4 ( 1.85x)
>
> Again, sorry for being pedantic here, but it gives the wrong
> impression especially if you look at this from outside.
>
> - Kacper

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
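
For readers unfamiliar with the "pmaxub + pcmpeqb trick" mentioned in the
quoted commit message, here is a minimal scalar sketch in C of the identity
it appears to rely on: for unsigned values, x <= thr exactly when
max(x, thr) == thr. This is not the code from the patch. The function names
count_le_ref and count_le_maxeq are hypothetical and exist only for this
illustration. A SIMD version would apply the same two steps to many bytes
at once, with an unsigned byte maximum (pmaxub) followed by a byte equality
compare (pcmpeqb).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: count pixels <= thr with a direct comparison. */
static unsigned count_le_ref(const uint8_t *src, size_t n, uint8_t thr)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++)
        count += src[i] <= thr;
    return count;
}

/* Hypothetical sketch of the max + equality form:
 * x <= thr  <=>  max(x, thr) == thr  (unsigned values). */
static unsigned count_le_maxeq(const uint8_t *src, size_t n, uint8_t thr)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Scalar stand-in for an unsigned byte maximum (pmaxub). */
        uint8_t m = src[i] > thr ? src[i] : thr;
        /* Scalar stand-in for a byte equality compare (pcmpeqb). */
        count += (m == thr);
    }
    return count;
}
```

Both functions should return the same count for any input, which is the
property a vectorized version would have to preserve.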