在 2023/5/28 12:36, Jean-Baptiste Kempf 写道:
> Hello,
>
> The last interaction still has the wrong name in patchset.
Thanks for reminding.  I modified the correct name in git.
>
> jb
>
> On Sun, 28 May 2023, at 12:23, Logan.Lyu wrote:
>> Hi, Martin
>>
>> I have finished the modification, please review again.
>>
>> Thanks.
>>
>>
>> 在 2023/5/26 16:34, Martin Storsjö 写道:
>>> Hi,
>>>
>>> Overall these patches seem mostly ok, but I've got a few minor points
>>> to make:
>>>
>>> - The usdot instruction requires the i8mm extension (part of
>>> armv8.6-a), while udot or sdot would require the dotprod extension
>>> (available in armv8.4-a). If you could manage with udot or sdot, these
>>> functions would be usable on a wider set of CPUs.
>>>
>>> Therefore, the current guards are wrong. Also, I finally got support
>>> implemented for optionally using these cpu extensions, even if the
>>> baseline of the compile don't include it, by runtime enabling it. See
>>> the patchset at
>>> https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=9009.
>>>
>>> To adapt your patches on top of this, see the two topmost commits at
>>> https://github.com/mstorsjo/ffmpeg/commits/archext.
>> Fixed.
>>> - The indentation is inconsistent; in the first patch, you have some
>>> instructions written like this:
>>>
>>> +        sqadd   v1.4s, v1.4s, v29.4s
>>>
>>> While you later use this style:
>>>
>>> +        dup             v1.16b, v28.b[1]
>>>
>>> The latter seems to match the style we commonly use; please reformat
>>> your code to match that consistently.
>>>
>>> With some macro invocations in the first patch, you also seem to have
>>> too much indentation in some places. See e.g. this:
>>>
>>> +1:      ldr             q23, [x2, x3]
>>> +        add             x2, x2, x3, lsl #1
>>> +        QPEL_FILTER_B     v26, v16, v17, v18, v19, v20, v21, v22, v23
>>> +        QPEL_FILTER_B2    v27, v16, v17, v18, v19, v20, v21, v22, v23
>>> +        QPEL_UNI_W_V_16
>>> +        subs            w4, w4, #1
>>> +        b.eq            2f
>>>
>>> (If the macro name is too long, that's ok, but here there's no need to
>>> have those lines unaligned.)
>> Fixed.
>>> - In the third patch, you've got multiple parameters from the stack
>>> like this:
>>>
>>> +        ldp             x14, x15, [sp]          // mx, my
>>> +        ldr             w13, [sp, #16]          // width
>>>
>>> I see that the mx an my parameters are intptr_t; that's good, since if
>>> they would be 32 bit integers, the ABI for such parameters on the
>>> stack differ between macOS/Darwin and Linux. But as long as they're
>>> intptr_t they behave the same.
>>>
>>> - At the same place, you're backing up a bunch of registers:
>>>
>>> +        stp             x20, x21, [sp, #-16]!
>>> +        stp             x22, x23, [sp, #-16]!
>>> +        stp             x24, x25, [sp, #-16]!
>>> +        stp             x26, x27, [sp, #-16]!
>>> +        stp             x28, x30, [sp, #-16]!
>>>
>>> This is inefficient; instead, do this:
>>>
>>> +        stp             x28, x30, [sp, #-80]!
>>> +        stp             x20, x21, [sp, #16]
>>> +        stp             x22, x23, [sp, #32]
>>> +        stp             x24, x25, [sp, #48]
>>> +        stp             x26, x27, [sp, #64]
>>>
>>> Also, following that, I see that you back up the stack pointer in x28.
>>> Why do you use x28 for that? Using x29 would be customary as frame
>>> pointer.
>> Using more efficient implementation now.  And x28 in this case is a
>> common callee save register. Anyway, I use x19 instead now.
>>> Aside for that, I think the rest of the patches is acceptable.
>>>
>>> // Martin
>>>
>>> _______________________________________________
>>> ffmpeg-devel mailing list
>>> ffmpeg-devel@ffmpeg.org
>>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>
>>> To unsubscribe, visit link above, or email
>>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>>
>> Attachments:
>> * 0002-lavc-aarch64-new-optimization-for-8-bit-hevc_qpel_un.patch
>> * 0003-lavc-aarch64-new-optimization-for-8-bit-hevc_qpel_h-.patch
>> * 0001-lavc-aarch64-new-optimization-for-8-bit-hevc_pel_uni.patch