Date: Tue, 29 Mar 2022 15:24:57 +0300 (EEST)
From: Martin Storsjö
To: FFmpeg development discussions and patches
Cc: Ben Avison
In-Reply-To: <20220325185257.513933-2-bavison@riscosopen.org>
Message-ID: <8ac45f-b8dd-3b1e-8335-a8791bdca99e@martin.st>
References: <20220317185819.466470-1-bavison@riscosopen.org> <20220325185257.513933-1-bavison@riscosopen.org> <20220325185257.513933-2-bavison@riscosopen.org>
Subject: Re: [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests

On Fri, 25 Mar 2022, Ben Avison wrote:

> Note that the benchmarking results for these functions are highly dependent
> upon the input data. Therefore, each function is benchmarked twice,
> corresponding to the best and worst case complexity of the reference C
> implementation. The performance of a real stream decode will fall somewhere
> between these two extremes.

Great idea to do separate benchmarking of the best/worst cases like this -
that is usually a recurring issue in benchmarking loop filters.

(Another issue with benchmarking loop filters is that the same function is
run repeatedly without resetting the input data in between - so depending on
the exact setup, it's possible that the decision about whether to filter or
not is taken differently in the first and last runs. But this implementation
seems very good in that respect!)

> +++ b/tests/checkasm/vc1dsp.c
> @@ -0,0 +1,94 @@
> +/*
> + * Copyright (c) 2022 Ben Avison
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> + */
> +
> +#include <string.h>
> +
> +#include "checkasm.h"
> +
> +#include "libavcodec/vc1dsp.h"
> +
> +#include "libavutil/common.h"
> +#include "libavutil/internal.h"
> +#include "libavutil/intreadwrite.h"
> +#include "libavutil/mem_internal.h"
> +
> +#define RANDOMIZE_BUFFER8_MID_WEIGHTED(name, size) \
> +    do { \
> +        uint8_t *p##0 = name##0, *p##1 = name##1; \
> +        int i = (size); \
> +        while (i-- > 0) { \
> +            int x = 0x80 | (rnd() & 0x7F); \
> +            x >>= rnd() % 9; \
> +            if (rnd() & 1) \
> +                x = -x; \
> +            *p##1++ = *p##0++ = 0x80 + x; \
> +        } \
> +    } while (0)
> +
> +#define CHECK_LOOP_FILTER(func) \
> +    do { \
> +        if (check_func(h.func, "vc1dsp." #func)) { \
> +            declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *, int, int); \
> +            for (int count = 1000; count > 0; --count) { \
> +                int pq = rnd() % 31 + 1; \
> +                RANDOMIZE_BUFFER8_MID_WEIGHTED(filter_buf, 24 * 24); \
> +                call_ref(filter_buf0 + 4 * 24 + 4, 24, pq); \
> +                call_new(filter_buf1 + 4 * 24 + 4, 24, pq); \
> +                if (memcmp(filter_buf0, filter_buf1, 24 * 24)) \
> +                    fail(); \
> +            } \
> +        } \
> +        for (int j = 0; j < 24; ++j) \
> +            for (int i = 0; i < 24; ++i) \
> +                filter_buf1[24*j + i] = 0x60 + 0x40 * (i >= 4 && j >= 4); \
> +        if (check_func(h.func, "vc1dsp." #func "_bestcase")) { \
> +            declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *, int, int); \
> +            bench_new(filter_buf1 + 4 * 24 + 4, 24, 1); \
> +            (void) checked_call; \
> +        } \
> +        if (check_func(h.func, "vc1dsp." #func "_worstcase")) { \
> +            declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *, int, int); \
> +            bench_new(filter_buf1 + 4 * 24 + 4, 24, 31); \
> +            (void) checked_call; \
> +        } \
> +    } while (0)
> +
> +void checkasm_check_vc1dsp(void)
> +{
> +    /* Deblocking filter buffers are big enough to hold a 16x16 block,
> +     * plus 4 rows/columns above/left to hold filter inputs (depending on
> +     * whether v or h neighbouring block edge) plus 4 rows/columns
> +     * right/below to catch write overflows */
> +    LOCAL_ALIGNED_4(uint8_t, filter_buf0, [24 * 24]);
> +    LOCAL_ALIGNED_4(uint8_t, filter_buf1, [24 * 24]);
> +
> +    VC1DSPContext h;
> +
> +    ff_vc1dsp_init(&h);
> +
> +    CHECK_LOOP_FILTER(vc1_v_loop_filter4);
> +    CHECK_LOOP_FILTER(vc1_h_loop_filter4);
> +    CHECK_LOOP_FILTER(vc1_v_loop_filter8);
> +    CHECK_LOOP_FILTER(vc1_h_loop_filter8);
> +    CHECK_LOOP_FILTER(vc1_v_loop_filter16);
> +    CHECK_LOOP_FILTER(vc1_h_loop_filter16);
> +
> +    report("loop_filter");
> +}

This looks great to me overall. I think it'd be nice to unmacro
CHECK_LOOP_FILTER though and make a separate check_loopfilter() function
like in vp8dsp.c instead, and move the declare_func_emms outside of
check_func() as you concluded.

// Martin
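
For illustration only, here is a rough standalone sketch of the function-based shape suggested in the review: the randomise / run reference / run candidate / compare loop as a plain function taking function pointers rather than a macro. It deliberately avoids the checkasm macros (check_func, declare_func_emms, call_ref, call_new) so that it compiles on its own. Everything in it is an assumption made for the example: rnd() is a local xorshift stand-in for checkasm's generator, and randomize_mid_weighted(), check_loop_filter(), toy_filter_ref() and toy_filter_off_by_one() are hypothetical names, not FFmpeg API and not the VC-1 filter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_DIM      24                   /* 16x16 block plus 4-pixel margins */
#define BUF_SIZE     (BUF_DIM * BUF_DIM)
#define BLOCK_OFFSET (4 * BUF_DIM + 4)    /* top-left pixel of the block */

/* Same argument shape as in the patch: (src, stride, pq). */
typedef void (*loop_filter_fn)(uint8_t *src, int stride, int pq);

/* Stand-in for checkasm's rnd(): a small xorshift32 generator. */
static uint32_t rng_state = 0x12345678u;
static uint32_t rnd(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Function form of RANDOMIZE_BUFFER8_MID_WEIGHTED: fills both buffers with
 * identical values that cluster around 0x80. */
static void randomize_mid_weighted(uint8_t *buf0, uint8_t *buf1, int size)
{
    assert(size > 0);
    for (int i = 0; i < size; i++) {
        int x = 0x80 | (int)(rnd() & 0x7F);
        x >>= rnd() % 9;
        if (rnd() & 1)
            x = -x;
        buf0[i] = buf1[i] = (uint8_t)(0x80 + x);
    }
}

/* Hypothetical toy filter (not VC-1): averages 4 pixel pairs across a
 * horizontal edge when the step across the edge is smaller than pq. */
static void toy_filter_ref(uint8_t *src, int stride, int pq)
{
    for (int i = 0; i < 4; i++) {
        int above = src[i - stride], below = src[i];
        int diff  = below > above ? below - above : above - below;
        if (diff < pq)
            src[i - stride] = src[i] = (uint8_t)((above + below + 1) >> 1);
    }
}

/* Deliberately wrong variant, to show that a mismatch is detected. */
static void toy_filter_off_by_one(uint8_t *src, int stride, int pq)
{
    toy_filter_ref(src, stride, pq);
    src[0] ^= 1;
}

/* Runs both functions on identical random input and returns the number of
 * iterations whose output buffers differ. The whole buffer is compared, so
 * writes outside the block are caught as well. */
static int check_loop_filter(loop_filter_fn ref, loop_filter_fn new_fn,
                             int iterations)
{
    uint8_t buf0[BUF_SIZE], buf1[BUF_SIZE];
    int mismatches = 0;

    for (int n = 0; n < iterations; n++) {
        int pq = (int)(rnd() % 31) + 1;
        randomize_mid_weighted(buf0, buf1, BUF_SIZE);
        ref(buf0 + BLOCK_OFFSET, BUF_DIM, pq);
        new_fn(buf1 + BLOCK_OFFSET, BUF_DIM, pq);
        if (memcmp(buf0, buf1, BUF_SIZE))
            mismatches++;
    }
    return mismatches;
}
```

Calling check_loop_filter(toy_filter_ref, toy_filter_ref, 1000) reports no mismatches, while passing toy_filter_off_by_one as the second argument reports a mismatch on every iteration. In the real test the loop body would presumably sit inside check_func(), with the function type declared once outside it, as discussed in the review.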