From: Nuo Mi <nuomi2021@gmail.com> To: "Martin Storsjö" <martin@martin.st> Cc: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH] checkasm: add support for vvc alf Date: Tue, 28 Feb 2023 23:06:36 +0800 Message-ID: <CAFXK13cdMJp61KC3ZTnGMVd6YS=JtmcwTyfBOC3QmPTnQjc65g@mail.gmail.com> (raw) In-Reply-To: <64cc66f8-a2fd-20f1-543-cab5403bfc2f@martin.st> On Tue, Feb 28, 2023 at 7:51 PM Martin Storsjö <martin@martin.st> wrote: > On Sun, 26 Feb 2023, Nuo Mi wrote: > > > +#include <string.h> > > + > > +#include "libavutil/intreadwrite.h" > > +#include "libavutil/mem_internal.h" > > + > > +#include "libavcodec/avcodec.h" > > + > > +#include "libavcodec/vvcdsp.h" > > +#include "libavcodec/vvcdec.h" > > + > > +#include "checkasm.h" > > + > > +static const uint32_t pixel_mask[3] = { 0xffffffff, 0x03ff03ff, > 0x0fff0fff }; > > + > > +#define SIZEOF_PIXEL ((bit_depth + 7) / 8) > > +#define PIXEL_STRIDE (ALF_SUBBLOCK_SIZE + 2 * ALF_PADDING_SIZE) > > +#define BUF_SIZE (PIXEL_STRIDE * (MAX_CTU_SIZE + 3 * 2) * 2) //+3 * 2 > for top and bottom row, *2 for high bit depth > > +#define LUMA_PARAMS_SIZE (ALF_SUBBLOCK_SIZE / 4 * ALF_SUBBLOCK_SIZE / 4 > * ALF_NUM_COEFF_LUMA) > > + > > +#define randomize_buffers(buf0, buf1, size) \ > > + do { \ > > + uint32_t mask = pixel_mask[(bit_depth - 8) >> 1]; \ > > + int k; \ > > + for (k = 0; k < size; k += 4) { \ > > + uint32_t r = rnd() & mask; \ > > + AV_WN32A(buf0 + k, r); \ > > + AV_WN32A(buf1 + k, r); \ > > + } \ > > + } while (0) > > + > > +#define randomize_buffers2(buf, size, filter) \ > > + do { \ > > + int k; \ > > + if (filter) { \ > > + for (k = 0; k < size; k++) { \ > > + uint8_t r = rnd(); \ > > + buf[k] = r; \ > > + } \ > > + } else { \ > > + for (k = 0; k < size; k++) { \ > > + int16_t r = rnd(); \ > > + buf[k] = r; \ > > + } \ > > + } \ > > + } while (0) > > I don't quite see the point of the extra uint8_t/int16_t variable r here - > you could just as well assign it directly to buf[k], no? Unless you > specifically want the effect where you're narrowing the random value to a > smaller range beforehand. > yes. clips can only get special values. https://github.com/ffvvc/FFmpeg/blob/main/libavcodec/vvc_filter_template.c#L653 fixed by https://github.com/ffvvc/FFmpeg/pull/42/commits/3034ea8050f7f2db580ef4a36bd806e16be943d4#diff-99295cc07508b29c356b6432f8dbd72c5dceb37584fb8b4be0d34da19455864dR61 > > > + > > +static void check_alf_luma_filter(VVCDSPContext *c, const int bit_depth) > > +{ > > + LOCAL_ALIGNED_32(uint8_t, dst0, [BUF_SIZE]); > > + LOCAL_ALIGNED_32(uint8_t, dst1, [BUF_SIZE]); > > + LOCAL_ALIGNED_32(uint8_t, src0, [BUF_SIZE]); > > + LOCAL_ALIGNED_32(uint8_t, src1, [BUF_SIZE]); > > + int8_t filter[LUMA_PARAMS_SIZE]; > > + int16_t clip[LUMA_PARAMS_SIZE]; > > + ptrdiff_t stride = PIXEL_STRIDE * SIZEOF_PIXEL; > > + int offset = (3 * PIXEL_STRIDE + 3) * SIZEOF_PIXEL; > > + > > + declare_func_emms(AV_CPU_FLAG_AVX2, void, uint8_t *dst, ptrdiff_t > dst_stride, const uint8_t *src, ptrdiff_t src_stride, > > + int width, int height, const int8_t *filter, const int16_t > *clip); > > + > > + randomize_buffers(src0, src1, BUF_SIZE); > > + randomize_buffers2(filter, LUMA_PARAMS_SIZE, 1); > > + randomize_buffers2(clip, LUMA_PARAMS_SIZE, 2); > > Here, both invocations of randomize_buffers2 are called with filter=1 or > 2, both which will pick the "if (filter) {" case, so the else in > randomize_buffers2 is unused? > fixed > > > + > > + for (int h = 4; h < ALF_SUBBLOCK_SIZE; h += 4) { > > + for (int w = 4; w < ALF_SUBBLOCK_SIZE; w += 4) { > > Wouldn't you want to use <= instead of < for the comparisons here? I don't > know vvc so I can't say for sure, but that would seem logical to me. > you are right, fixed. > > Are all aspect ratio combinations allowed here? E.g. the log mentions that > you're testing 28x4 blocks. In dav1d, similar cases use logic of testing w > in the range of [h/4, h*4] (plus limited to e.g. [4,64] at the same time). > That would reduce the number of combinations to test, if they aren't > really valid in practice. > yes. almost all 4x width and height are valid. last col or row may have a small width and height. > > > + if (check_func(c->alf.filter[LUMA], > "vvc_alf_filter_luma_%dx%d_%d", w, h, bit_depth)) { > > + memset(dst0, 0, BUF_SIZE); > > + memset(dst1, 0, BUF_SIZE); > > + call_ref(dst0, stride, src0 + offset, stride, w, h, > filter, clip); > > + call_new(dst1, stride, src1 + offset, stride, w, h, > filter, clip); > > + for (int i = 0; i < h; i++) { > > + if (memcmp(dst0 + i * stride, dst1 + i * stride, w > * SIZEOF_PIXEL)) > > + fail(); > > + } > > + bench_new(dst1, stride, src1 + offset, stride, w, h, > filter, clip); > > + } > > + if (check_func(c->alf.filter[CHROMA], > "vvc_alf_filter_chroma_%dx%d_%d", w, h, bit_depth)) { > > + memset(dst0, 0, BUF_SIZE); > > + memset(dst1, 0, BUF_SIZE); > > + call_ref(dst0, stride, src0 + offset, stride, w, h, > filter, clip); > > + call_new(dst1, stride, src1 + offset, stride, w, h, > filter, clip); > > + for (int i = 0; i < h; i++) { > > + if (memcmp(dst0 + i * stride, dst1 + i * stride, w > * SIZEOF_PIXEL)) > > + fail(); > > + } > > + bench_new(dst1, stride, src1 + offset, stride, w, h, > filter, clip); > > + } > > + } > > + } > > +} > > + > > +void checkasm_check_vvc_alf(void) > > +{ > > + int bit_depth = 10; > > + VVCDSPContext h; > > + ff_vvc_dsp_init(&h, bit_depth); > > + check_alf_luma_filter(&h, bit_depth); > > + report("alf_filter"); > > Is 10 bits the only currently valid bitdepth here? Otherwise I think we > should include all of them, even if there's only assembly for one case at > the moment. (The pixel_mask seems to assume 8/10/12 bits.) > I use the 10 bits version to collect performance suggestions. After it done, I will add 8 and 12 bits version But you are right. I can add 8/12 bits test first. > > // Martin > > Hi Martin, Thank you for the review. comments inline. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
prev parent reply other threads:[~2023-02-28 15:06 UTC|newest] Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-02-26 6:09 Nuo Mi 2023-02-28 11:51 ` Martin Storsjö 2023-02-28 15:06 ` Nuo Mi [this message]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAFXK13cdMJp61KC3ZTnGMVd6YS=JtmcwTyfBOC3QmPTnQjc65g@mail.gmail.com' \ --to=nuomi2021@gmail.com \ --cc=ffmpeg-devel@ffmpeg.org \ --cc=martin@martin.st \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel This inbox may be cloned and mirrored by anyone: git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \ ffmpegdev@gitmailbox.com public-inbox-index ffmpegdev Example config snippet for mirrors. AGPL code for this site: git clone https://public-inbox.org/public-inbox.git