在 2023/5/28 12:36, Jean-Baptiste Kempf 写道: > Hello, > > The last interaction still has the wrong name in patchset. Thanks for reminding.  I modified the correct name in git. > > jb > > On Sun, 28 May 2023, at 12:23, Logan.Lyu wrote: >> Hi, Martin >> >> I have finished the modification, please review again. >> >> Thanks. >> >> >> 在 2023/5/26 16:34, Martin Storsjö 写道: >>> Hi, >>> >>> Overall these patches seem mostly ok, but I've got a few minor points >>> to make: >>> >>> - The usdot instruction requires the i8mm extension (part of >>> armv8.6-a), while udot or sdot would require the dotprod extension >>> (available in armv8.4-a). If you could manage with udot or sdot, these >>> functions would be usable on a wider set of CPUs. >>> >>> Therefore, the current guards are wrong. Also, I finally got support >>> implemented for optionally using these cpu extensions, even if the >>> baseline of the compile don't include it, by runtime enabling it. See >>> the patchset at >>> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=9009. >>> >>> To adapt your patches on top of this, see the two topmost commits at >>> https://github.com/mstorsjo/ffmpeg/commits/archext. >> Fixed. >>> - The indentation is inconsistent; in the first patch, you have some >>> instructions written like this: >>> >>> +        sqadd   v1.4s, v1.4s, v29.4s >>> >>> While you later use this style: >>> >>> +        dup             v1.16b, v28.b[1] >>> >>> The latter seems to match the style we commonly use; please reformat >>> your code to match that consistently. >>> >>> With some macro invocations in the first patch, you also seem to have >>> too much indentation in some places. See e.g. this: >>> >>> +1:      ldr             q23, [x2, x3] >>> +        add             x2, x2, x3, lsl #1 >>> +        QPEL_FILTER_B     v26, v16, v17, v18, v19, v20, v21, v22, v23 >>> +        QPEL_FILTER_B2    v27, v16, v17, v18, v19, v20, v21, v22, v23 >>> +        QPEL_UNI_W_V_16 >>> +        subs            w4, w4, #1 >>> +        b.eq            2f >>> >>> (If the macro name is too long, that's ok, but here there's no need to >>> have those lines unaligned.) >> Fixed. >>> - In the third patch, you've got multiple parameters from the stack >>> like this: >>> >>> +        ldp             x14, x15, [sp]          // mx, my >>> +        ldr             w13, [sp, #16]          // width >>> >>> I see that the mx an my parameters are intptr_t; that's good, since if >>> they would be 32 bit integers, the ABI for such parameters on the >>> stack differ between macOS/Darwin and Linux. But as long as they're >>> intptr_t they behave the same. >>> >>> - At the same place, you're backing up a bunch of registers: >>> >>> +        stp             x20, x21, [sp, #-16]! >>> +        stp             x22, x23, [sp, #-16]! >>> +        stp             x24, x25, [sp, #-16]! >>> +        stp             x26, x27, [sp, #-16]! >>> +        stp             x28, x30, [sp, #-16]! >>> >>> This is inefficient; instead, do this: >>> >>> +        stp             x28, x30, [sp, #-80]! >>> +        stp             x20, x21, [sp, #16] >>> +        stp             x22, x23, [sp, #32] >>> +        stp             x24, x25, [sp, #48] >>> +        stp             x26, x27, [sp, #64] >>> >>> Also, following that, I see that you back up the stack pointer in x28. >>> Why do you use x28 for that? Using x29 would be customary as frame >>> pointer. >> Using more efficient implementation now.  And x28 in this case is a >> common callee save register. Anyway, I use x19 instead now. >>> Aside for that, I think the rest of the patches is acceptable. >>> >>> // Martin >>> >>> _______________________________________________ >>> ffmpeg-devel mailing list >>> ffmpeg-devel@ffmpeg.org >>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >>> >>> To unsubscribe, visit link above, or email >>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >> >> Attachments: >> * 0002-lavc-aarch64-new-optimization-for-8-bit-hevc_qpel_un.patch >> * 0003-lavc-aarch64-new-optimization-for-8-bit-hevc_qpel_h-.patch >> * 0001-lavc-aarch64-new-optimization-for-8-bit-hevc_pel_uni.patch