From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 2 May 2024 00:59:13 +0200
User-Agent: Mozilla Thunderbird
To: ffmpeg-devel@ffmpeg.org
References: <20240501224031.109294-2-chen.stonechen@gmail.com>
Content-Language: en-US
From: Andreas Rheinhardt
In-Reply-To: <20240501224031.109294-2-chen.stonechen@gmail.com>
X-Microsoft-Original-Message-ID: <3465826f-44eb-4847-a87b-4297da4daf94@outlook.com>
MIME-Version: 1.0
Subject: Re: [FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (*sad) to (*sad[6]) to prepare for AVX2 funcs
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Stone Chen:
> To prepare for adding AVX2 functions
> for different block widths, change VVCInterDSPContext to contain
> (*sad[6]) instead of (*sad). This also default initializes the pointer
> array with the scalar function and the calling sites to jump to the
> correct function based on block width. There's no change in
> functionality.
> ---
>  libavcodec/vvc/dsp.h            | 2 +-
>  libavcodec/vvc/inter.c          | 4 ++--
>  libavcodec/vvc/inter_template.c | 5 ++++-
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/libavcodec/vvc/dsp.h b/libavcodec/vvc/dsp.h
> index 9810ac314c..b06a3ef10e 100644
> --- a/libavcodec/vvc/dsp.h
> +++ b/libavcodec/vvc/dsp.h
> @@ -86,7 +86,7 @@ typedef struct VVCInterDSPContext {
>
>      void (*apply_bdof)(uint8_t *dst, ptrdiff_t dst_stride, int16_t *src0, int16_t *src1, int block_w, int block_h);
>
> -    int (*sad)(const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
> +    int (*sad[6])(const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
>      void (*dmvr[2][2])(int16_t *dst, const uint8_t *src, ptrdiff_t src_stride, int height,
>          intptr_t mx, intptr_t my, int width);
>  } VVCInterDSPContext;
> diff --git a/libavcodec/vvc/inter.c b/libavcodec/vvc/inter.c
> index 4a8d1d866a..a68f4f9452 100644
> --- a/libavcodec/vvc/inter.c
> +++ b/libavcodec/vvc/inter.c
> @@ -742,7 +742,7 @@ static void dmvr_mv_refine(VVCLocalContext *lc, MvField *mvf, MvField *orig_mv,
>          fc->vvcdsp.inter.dmvr[!!my][!!mx](tmp[i], src, src_stride, pred_h, mx, my, pred_w);
>      }
>
> -    min_sad = fc->vvcdsp.inter.sad(tmp[L0], tmp[L1], dx, dy, block_w, block_h);
> +    min_sad = fc->vvcdsp.inter.sad[av_log2(block_w) - 2](tmp[L0], tmp[L1], dx, dy, block_w, block_h);
>      min_sad -= min_sad >> 2;
>      sad[dy][dx] = min_sad;
>
> @@ -752,7 +752,7 @@ static void dmvr_mv_refine(VVCLocalContext *lc, MvField *mvf, MvField *orig_mv,
>          for (dy = 0; dy < SAD_ARRAY_SIZE; dy++) {
>              for (dx = 0; dx < SAD_ARRAY_SIZE; dx++) {
>                  if (dx != sr_range || dy != sr_range) {
> -                    sad[dy][dx] = fc->vvcdsp.inter.sad(lc->tmp, lc->tmp1, dx, dy, block_w, block_h);
> +                    sad[dy][dx] = fc->vvcdsp.inter.sad[av_log2(block_w) - 2](lc->tmp, lc->tmp1, dx, dy, block_w, block_h);
>                      if (sad[dy][dx] < min_sad) {
>                          min_sad = sad[dy][dx];
>                          min_dx = dx;
> diff --git a/libavcodec/vvc/inter_template.c b/libavcodec/vvc/inter_template.c
> index e2fbfd4fc0..545e8dd184 100644
> --- a/libavcodec/vvc/inter_template.c
> +++ b/libavcodec/vvc/inter_template.c
> @@ -458,7 +458,10 @@ static void FUNC(ff_vvc_inter_dsp_init)(VVCInterDSPContext *const inter)
>      inter->apply_prof_uni_w = FUNC(apply_prof_uni_w);
>      inter->apply_bdof = FUNC(apply_bdof);
>      inter->prof_grad_filter = FUNC(prof_grad_filter);
> -    inter->sad = vvc_sad;
> +
> +    for (int i = 0; i < FF_ARRAY_ELEMS(inter->sad); i++) {
> +        inter->sad[i] = vvc_sad;
> +    }
>  }
>
>  #undef FUNCS

Why is the jump depending upon block width not performed inside your
avx2 implementation?

- Andreas

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
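
For reference, the two dispatch options discussed in this thread can be
sketched roughly as below. This is only an illustration under stated
assumptions, not code from the patch: ilog2() is a hypothetical stand-in
for av_log2(), sad_scalar() is a simplified SAD over contiguous blocks
(the real vvc_sad also takes dx and dy), and sad_index(),
sad_via_table() and sad_single_entry() are names invented here. The
assumed range of widths (powers of two from 4 to 128) is inferred from
the six table slots and the "- 2" offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for av_log2(): floor(log2(v)) for v > 0. */
static int ilog2(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Simplified signature; the real callback also receives dx and dy. */
typedef int (*sad_fn)(const int16_t *src0, const int16_t *src1,
                      int block_w, int block_h);

/* Illustrative scalar SAD over two contiguous block_w x block_h blocks. */
static int sad_scalar(const int16_t *src0, const int16_t *src1,
                      int block_w, int block_h)
{
    int sum = 0;
    for (int i = 0; i < block_w * block_h; i++) {
        int d = src0[i] - src1[i];
        sum += d < 0 ? -d : d;
    }
    return sum;
}

/* Maps an assumed power-of-two width in [4, 128] to a slot in [0, 5]. */
static int sad_index(int block_w)
{
    assert(block_w >= 4 && block_w <= 128 &&
           (block_w & (block_w - 1)) == 0);
    return ilog2((unsigned)block_w) - 2;
}

/* Option A (the patch): the call site selects a slot by block width. */
static int sad_via_table(sad_fn table[6], const int16_t *a,
                         const int16_t *b, int w, int h)
{
    return table[sad_index(w)](a, b, w, h);
}

/* Option B (what the reply asks about): one entry point that branches
 * on the width internally, so callers and the context stay unchanged. */
static int sad_single_entry(const int16_t *a, const int16_t *b,
                            int w, int h)
{
    switch (w) {
    /* Width-specialised kernels would be selected by extra cases here. */
    default:
        return sad_scalar(a, b, w, h);
    }
}
```

With every slot of the table set to sad_scalar, both options return the
same value for the same inputs; they differ only in where the
width-dependent branch lives.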