From: Stone Chen <chen.stonechen@gmail.com>
To: "Ronald S. Bultje" <rsbultje@gmail.com>
Cc: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC
Date: Tue, 21 May 2024 20:05:07 -0400
Message-ID: <CAHpaCCi-mpULkJ-uUBvoikgeWg_uoq9C0i9B_zo6jhgT-wjB_g@mail.gmail.com>
In-Reply-To: <CAEEMt2k=qPBkTVVp-JHFMS9G=1eNjt0oReoVKbQP6ZokD74DjA@mail.gmail.com>

On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje <rsbultje@gmail.com> wrote:

> Hi,
>
> This is mostly good, the following is tiny nitpicks.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonechen@gmail.com>
> wrote:
>
>> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2
>>
>
> The macro is only used once, so you could inline it in the calling
> function.
>
>>
>> +    imul %5, 128
>> +    imul %6, 128
>>
>
> I believe shl is typically preferred over imul for powers of two.
>
>
>> +    add %5, 2
>> +    add %6, 2
>>
>
> And these can be integrated as a constant offset in the lea below (lea %1,
> [%1 + %5 * 2 + 2 * 2], same for %2).
>
>
>> +    add %5, %3
>> +    sub %6, %3
>> +
>> +    lea %1, [%1 + %5 * 2]
>> +    lea %2, [%2 + %6 * 2]
>
> [..]
>
>> +cglobal vvc_sad, 6, 11, 5, src1, src2, dx, dy, block_w, block_h, off1,
>> off2, row_idx, dx2, dy2
>> +    movsxd dx2q, dxd
>> +    movsxd dy2q, dyd
>>
>
> If you change the argument type from int to intptr_t, this is not
> necessary anymore.
>
>
>> +    vvc_sad_16_128:
>> +    .loop_height:
>> +        mov off1q, src1q
>> +        mov off2q, src2q
>> +        mov row_idxd, block_wd
>> +        sar row_idxd, 4
>>
>
> You could right-shift block_wd by 4 outside the loop (before .loop_height).
>
> Ronald
>

On Mon, May 20, 2024 at 11:53 AM Ronald S. Bultje <rsbultje@gmail.com> wrote:

> Hi,
>
> one more, I forgot.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonechen@gmail.com>
> wrote:
>
>> +pw_1: dw 1
>>
> [..]
>
>> +    vpbroadcastw m4, [pw_1]
>>
>
> We typically suggest to use vpbroadcastd, not w (and then pw_1: times 2 dw
> 1). agner shows that on e.g. Haswell, the former (d) is 1 uops with 5
> cycles latency, whereas the latter (w) is 3 uops with 7 cycles latency, or
> more generally d is faster then w.
>
> Ronald
>

Hi Ronald,

I've sent a v5 incorporating all the above, thank you for the feedback!

-Stone
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
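The offset suggestions in the review above (shl instead of imul, and folding the "add 2" into the lea displacement) can be sanity-checked outside of assembly. The following is only a minimal sketch in C, not code from the patch: the function names offset_v4, offset_review and check_offsets are hypothetical, registers are modeled as uint64_t so that wraparound matches 64-bit register arithmetic, and the tested value ranges are arbitrary assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset as the quoted v4 macro computes it for one source pointer.
 * "row" stands for the initial value of off1/off2 (assumption: not shown
 * in the quoted hunk), "dx" is +dx for src1 and -dx for src2. */
static uint64_t offset_v4(int64_t row, int64_t dx)
{
    uint64_t off = (uint64_t)row;
    off *= 128;              /* imul off, 128 */
    off += 2;                /* add  off, 2   */
    off += (uint64_t)dx;     /* add/sub off, dx */
    return off * 2;          /* lea  src, [src + off * 2] */
}

/* Same offset with the reviewed changes applied. */
static uint64_t offset_review(int64_t row, int64_t dx)
{
    uint64_t off = (uint64_t)row;
    off <<= 7;               /* shl off, 7 (128 == 1 << 7) */
    off += (uint64_t)dx;     /* add/sub off, dx */
    return off * 2 + 2 * 2;  /* lea src, [src + off * 2 + 2 * 2] */
}

/* Compare both forms over a small, arbitrarily chosen range of inputs. */
static void check_offsets(void)
{
    for (int64_t row = -8; row <= 8; row++)
        for (int64_t dx = -8; dx <= 8; dx++)
            assert(offset_v4(row, dx) == offset_review(row, dx));
}
```

Calling check_offsets() from a test program exercises both forms. The constant becomes 2 * 2 in the lea because the original "add 2" happens before the index is scaled by the 2-byte sample size.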