Hi Martin, I updated these patches based on your comments, please help to review it again.  And My reply is as follows : 在 2023/5/4 16:49, Martin Storsjö 写道: > On Wed, 3 May 2023, myais wrote: > >>> Hello, >>> >>> - I splited this patch, Do I need to resubmit or just attach them as >>> attachments? (I attached those patches.  If I need to resummit, >>> please let me know.) >> >> The attached form here is fine with me. >> >> I didn't review it in detail yet, but I think it roughly looks ok, >> but there's a few issues. >> >> The "fate-hevc" tests as part of our testsuite fails with these >> patches applied - please test that and make sure it passes. I'm having some trouble downloading fate-suite using rsync, and I can't download it now, It may be caused by network problems. but I tested the decoding of hevc (h265) files myself, and it can be decoded into yuv files normally, and I will continue to try to download fate-suite and test it soon. >> >> The assembly fails to build with MSVC, with errors like these: >> >> libavcodec\aarch64\hevcdsp_qpel_neon.o.asm(2278) : error A2173: >> syntax error in expression >>         add     x2, x2, x3, lsl 1 >> >> The immediate constant 1 here should be prefixed with #, the same >> thing goes in a lot of other places in the same file. - immediate constant prefixed with # ---fixed. >> >> If compiling only patch 1, it fails to build due to a mismatched >> #endif in the c file. - mismatched #endif ---fixed. >> >> The second patch adds code that git flags as "No newline at end of >> file"; please don't do that, please make sure your editor saves the >> file as usual with a trailing newline. - trailing newline --- fixed. >> >> The patches have some cases of trailing whitespace, please make sure >> you don't have time. - trailing whitespace ---fixed. >> >> In the second patch, you're inconsistently using "#if >> __ARM_FEATURE_DOTPROD" and "#if defined(__ARM_FEATURE_DOTPROD)". -  "#if defined(__ARM_FEATURE_DOTPROD)" is now uniformly used. >> >> Dot product is a new feature we haven't taken advantage of on aarch64 >> before. None of my toolchains/environments have this enabled by >> default. It would be good if you'd provide examples of where you're >> testing it and how you configure the build to enable it. Dot product is an Optional instruntion from Armv8.2 to Armv8.5, and from armv8.6 it is mandatory for implementations. so it can be enable by add flags "-march=armv8.6". >> >> Because right now, most of this assembly runs untested unless >> building in a specific configuration that explicitly enables that >> extension. >> >> For such features, we generally would want to always compile/assemble >> the feature (as long as the toolchain supports assembling it, with >> some extra flag), and use runtime checks for detecting whether the >> cpu feature is supported. But I guess adding support for that is a >> bigger separate project, and this approach with build time ifdefs if >> the toolchain mandates support for it, is tolerable for now. >> >> If there's not a huge benefit from the dot product instructions, >> maybe it would be best to just not use them, so the whole wide >> audience can benefit from the optimizations? Yes, for calculations in the horizontal direction, the dot product command can bring a relatively large improvement, compared with the MLA instructions. Perhaps in the future,  a more general version can be implemented using the mla instruction. > > > // Martin > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".