From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_w_pixels, qpel_uni_w_h, qpel_uni_w_v, qpel_uni_w_hv and qpel_h
Date: Fri, 26 May 2023 11:34:01 +0300 (EEST)
Message-ID: <4badda67-2eb-e262-f791-9d2847dbd71@martin.st>
In-Reply-To: <e3a3867a-7f85-9001-c93e-b027456cfbef@myais.com.cn>

Hi,

Overall these patches seem mostly ok, but I've got a few minor points to make:

- The usdot instruction requires the i8mm extension (part of armv8.6-a), while udot or sdot would require the dotprod extension (available in armv8.4-a). If you could manage with udot or sdot, these functions would be usable on a wider set of CPUs.

  Also, I finally got support implemented for optionally using these CPU extensions by enabling them at runtime, even if the baseline of the compile doesn't include them. See the patchset at https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=9009. To adapt your patches on top of this, see the two topmost commits at https://github.com/mstorsjo/ffmpeg/commits/archext.

- The indentation is inconsistent; in the first patch, you have some instructions written like this:

+ sqadd v1.4s, v1.4s, v29.4s

  While you later use this style:

+ dup v1.16b, v28.b[1]

  The latter seems to match the style we commonly use; please reformat your code to match that consistently.

  With some macro invocations in the first patch, you also seem to have too much indentation in some places. See e.g. this:

+1: ldr q23, [x2, x3]
+ add x2, x2, x3, lsl #1
+ QPEL_FILTER_B v26, v16, v17, v18, v19, v20, v21, v22, v23
+ QPEL_FILTER_B2 v27, v16, v17, v18, v19, v20, v21, v22, v23
+ QPEL_UNI_W_V_16
+ subs w4, w4, #1
+ b.eq 2f

  (If the macro name is too long, that's ok, but here there's no need to have those lines unaligned.)
- In the third patch, you load multiple parameters from the stack like this:

+ ldp x14, x15, [sp] // mx, my
+ ldr w13, [sp, #16] // width

  I see that the mx and my parameters are intptr_t; that's good, since if they were 32-bit integers, the ABI for such parameters on the stack would differ between macOS/Darwin and Linux (Darwin packs stack arguments to their natural size, while Linux gives each one an 8-byte slot). But as long as they're intptr_t they behave the same.

- At the same place, you're backing up a bunch of registers:

+ stp x20, x21, [sp, #-16]!
+ stp x22, x23, [sp, #-16]!
+ stp x24, x25, [sp, #-16]!
+ stp x26, x27, [sp, #-16]!
+ stp x28, x30, [sp, #-16]!

  This is inefficient, since every one of these instructions writes back to sp. Instead, decrement sp once and store the remaining pairs at fixed offsets:

+ stp x28, x30, [sp, #-80]!
+ stp x20, x21, [sp, #16]
+ stp x22, x23, [sp, #32]
+ stp x24, x25, [sp, #48]
+ stp x26, x27, [sp, #64]

  Also, following that, I see that you back up the stack pointer in x28. Why do you use x28 for that? Using x29 would be customary as frame pointer.

Aside from that, I think the rest of the patches are acceptable.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
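As an illustration of the usdot versus udot/sdot point in the review above, here is a minimal scalar sketch in C of one conceivable way to get an unsigned-pixel times signed-coefficient dot product out of a signed-only dot product: bias each pixel by -128 so it fits in int8_t, then add 128 times the coefficient sum back afterwards. This is not taken from the patches or from the review; the function names (usdot4_model, sdot4_model, filter8_usdot, filter8_sdot_biased, check_bias_trick) are hypothetical, and the sketch says nothing about whether the extra bias and correction steps would pay off in actual NEON code. The coefficient tables are the HEVC luma quarter-pel filters, whose taps each sum to 64.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 32-bit lane of usdot (i8mm extension):
 * four unsigned 8-bit values times four signed 8-bit values, summed. */
int32_t usdot4_model(const uint8_t *u, const int8_t *s)
{
    int32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (int32_t)u[i] * s[i];
    return acc;
}

/* Scalar model of one 32-bit lane of sdot (dotprod extension):
 * both inputs are signed 8-bit values. */
int32_t sdot4_model(const int8_t *a, const int8_t *b)
{
    int32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (int32_t)a[i] * b[i];
    return acc;
}

/* 8-tap filter computed directly, as two usdot lanes. */
int32_t filter8_usdot(const uint8_t *px, const int8_t *coef)
{
    return usdot4_model(px, coef) + usdot4_model(px + 4, coef + 4);
}

/* Same filter using only the signed dot product (hypothetical approach):
 * sum(p * c) == sum((p - 128) * c) + 128 * sum(c). */
int32_t filter8_sdot_biased(const uint8_t *px, const int8_t *coef)
{
    int8_t biased[8];
    int32_t coef_sum = 0;
    for (int i = 0; i < 8; i++) {
        biased[i] = (int8_t)(px[i] - 128); /* value is in -128..127 */
        coef_sum += coef[i];
    }
    return sdot4_model(biased, coef) + sdot4_model(biased + 4, coef + 4)
           + 128 * coef_sum;
}

/* Compare both variants over the three HEVC luma qpel filters. */
void check_bias_trick(void)
{
    static const int8_t filters[3][8] = {
        { -1, 4, -10, 58, 17,  -5, 1,  0 },
        { -1, 4, -11, 40, 40, -11, 4, -1 },
        {  0, 1,  -5, 17, 58, -10, 4, -1 },
    };
    uint8_t px[8];
    for (int f = 0; f < 3; f++) {
        for (int v = 0; v < 256; v++) {
            /* Arbitrary pixel pattern; the cast wraps modulo 256. */
            for (int i = 0; i < 8; i++)
                px[i] = (uint8_t)(v * (i + 1) + 37 * i);
            assert(filter8_usdot(px, filters[f]) ==
                   filter8_sdot_biased(px, filters[f]));
        }
    }
}
```

Because the coefficient sum is a constant for these filters, the correction term reduces to a fixed offset (128 * 64) that could be folded into an existing rounding or offset constant. Calling check_bias_trick() exercises both variants on the same inputs and aborts on any mismatch.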