From: Andreas Rheinhardt
Date: Wed, 22 May 2024 07:02:36 +0200
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC
In-Reply-To: <20240522000039.34913-2-chen.stonechen@gmail.com>
References: <20240522000039.34913-2-chen.stonechen@gmail.com>

Stone Chen:
> Implements AVX2 DMVR (decoder-side motion vector
> refinement) SAD functions. DMVR SAD is only calculated if w >= 8, h >= 8,
> and w * h > 128. To reduce complexity, SAD is only calculated on even rows.
> This is calculated for all video bitdepths, but the values passed to the
> function are always 16bit (even if the original video bitdepth is 8). The
> AVX2 implementation uses min/max/sub.
>
> Additionally this changes parameters dx and dy from int to intptr_t. This
> allows dx & dy to be used as pointer offsets without needing to use movsxd.
>
> Benchmarks ( AMD 7940HS )
> Before:
> BQTerrace_1920x1080_60_10_420_22_RA.vvc   | 106.0 |
> Chimera_8bit_1080P_1000_frames.vvc        | 204.3 |
> NovosobornayaSquare_1920x1080.bin         | 197.3 |
> RitualDance_1920x1080_60_10_420_37_RA.266 | 174.0 |
>
> After:
> BQTerrace_1920x1080_60_10_420_22_RA.vvc   | 109.3 |
> Chimera_8bit_1080P_1000_frames.vvc        | 216.0 |
> NovosobornayaSquare_1920x1080.bin         | 204.0 |
> RitualDance_1920x1080_60_10_420_37_RA.266 | 181.7 |
> ---
>  libavcodec/vvc/dsp.c             |   2 +-
>  libavcodec/vvc/dsp.h             |   2 +-
>  libavcodec/x86/vvc/Makefile      |   3 +-
>  libavcodec/x86/vvc/vvc_sad.asm   | 130 +++++++++++++++++++++++++++++++
>  libavcodec/x86/vvc/vvcdsp_init.c |   6 ++
>  5 files changed, 140 insertions(+), 3 deletions(-)
>  create mode 100644 libavcodec/x86/vvc/vvc_sad.asm
>
> diff --git a/libavcodec/x86/vvc/vvcdsp_init.c b/libavcodec/x86/vvc/vvcdsp_init.c
> index 0e68971b2c..aa6c916760 100644
> --- a/libavcodec/x86/vvc/vvcdsp_init.c
> +++ b/libavcodec/x86/vvc/vvcdsp_init.c
> @@ -311,6 +311,9 @@ ALF_FUNCS(16, 12, avx2)
>      c->alf.filter[CHROMA] = ff_vvc_alf_filter_chroma_##bd##_avx2; \
>      c->alf.classify = ff_vvc_alf_classify_##bd##_avx2; \
>  } while (0)
> +
> +int ff_vvc_sad_avx2(const int16_t *src0, const int16_t *src1, intptr_t dx, intptr_t dy, int block_w, int block_h);
> +#define SAD_INIT() c->inter.sad = ff_vvc_sad_avx2

You are adding an AVX2 function to an ARCH_X86_64 #if block. I expect this
to lead to linking failures if AVX2 is disabled.
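
To illustrate the concern, here is a minimal sketch of one way the reference
to the AVX2 symbol could be kept behind a build-time guard. It is not the
fix that was applied to the patch. It assumes the build provides
HAVE_AVX2_EXTERNAL as 0 or 1, the way FFmpeg's config.h normally does;
DemoInterDSP, demo_sad_c and demo_dsp_init_x86 are hypothetical stand-ins,
and SAD_INIT takes the context as an argument here, unlike in the patch. A
real init function would additionally check CPU flags at runtime, which is
omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the build system provides these as 0 or 1. The defaults
 * below exist only so the sketch compiles on its own, with AVX2 treated
 * as disabled. */
#ifndef ARCH_X86_64
#define ARCH_X86_64 1
#endif
#ifndef HAVE_AVX2_EXTERNAL
#define HAVE_AVX2_EXTERNAL 0
#endif

typedef int (*demo_sad_fn)(const int16_t *src0, const int16_t *src1,
                           intptr_t dx, intptr_t dy,
                           int block_w, int block_h);

/* Hypothetical stand-in for the inter part of VVCDSPContext. */
typedef struct DemoInterDSP {
    demo_sad_fn sad;
} DemoInterDSP;

/* Hypothetical placeholder for the C fallback; it does no real work. */
static int demo_sad_c(const int16_t *src0, const int16_t *src1,
                      intptr_t dx, intptr_t dy, int block_w, int block_h)
{
    (void)src0; (void)src1; (void)dx; (void)dy;
    (void)block_w; (void)block_h;
    return 0;
}

#if ARCH_X86_64 && HAVE_AVX2_EXTERNAL
/* Prototype as quoted in the patch; only declared and referenced when
 * the AVX2 assembly is actually built. */
int ff_vvc_sad_avx2(const int16_t *src0, const int16_t *src1,
                    intptr_t dx, intptr_t dy, int block_w, int block_h);
#define SAD_INIT(c) ((c)->sad = ff_vvc_sad_avx2)
#else
/* AVX2 disabled: expand to a no-op so no AVX2 symbol is referenced. */
#define SAD_INIT(c) ((void)(c))
#endif

/* Hypothetical init function; keeps the fallback when AVX2 is disabled. */
static void demo_dsp_init_x86(DemoInterDSP *c)
{
    assert(c != NULL);
    SAD_INIT(c);
}
```

With HAVE_AVX2_EXTERNAL at 0, the function pointer keeps whatever fallback
was installed before, and the linker never sees ff_vvc_sad_avx2.
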
>  #endif
>
>  void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
> @@ -327,6 +330,7 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
>              ALF_INIT(8);
>              AVG_INIT(8, avx2);
>              MC_LINKS_AVX2(8);
> +            SAD_INIT();
>          }
>          break;
>      case 10:
> @@ -338,6 +342,7 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
>              AVG_INIT(10, avx2);
>              MC_LINKS_AVX2(10);
>              MC_LINKS_16BPC_AVX2(10);
> +            SAD_INIT();
>          }
>          break;
>      case 12:
> @@ -349,6 +354,7 @@ void ff_vvc_dsp_init_x86(VVCDSPContext *const c, const int bd)
>              AVG_INIT(12, avx2);
>              MC_LINKS_AVX2(12);
>              MC_LINKS_16BPC_AVX2(12);
> +            SAD_INIT();
>          }
>          break;
>      default:

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
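
For reference, the computation described in the quoted commit message (a sum
of absolute differences over 16-bit samples, taken on even rows only, with
the absolute value formed as max minus min) can be sketched in scalar C as
below. This is an illustration under stated assumptions, not the code from
the patch or from FFmpeg's C fallback: sad_even_rows and its stride
parameter (in elements) are hypothetical, and the dx/dy pointer offsets
mentioned in the commit message are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: SAD over even rows only.
 * `stride` is an assumed parameter giving the row pitch in elements. */
static int sad_even_rows(const int16_t *src0, const int16_t *src1,
                         ptrdiff_t stride, int block_w, int block_h)
{
    int sad = 0;

    /* Size constraints quoted in the commit message. */
    assert(block_w >= 8 && block_h >= 8 && block_w * block_h > 128);

    for (int y = 0; y < block_h; y += 2) {        /* even rows only */
        for (int x = 0; x < block_w; x++) {
            const int a  = src0[x];
            const int b  = src1[x];
            const int hi = a > b ? a : b;         /* max */
            const int lo = a > b ? b : a;         /* min */
            sad += hi - lo;                       /* |a - b| as max - min */
        }
        src0 += 2 * stride;                       /* skip the odd row */
        src1 += 2 * stride;
    }
    return sad;
}
```

Such a scalar version is mainly useful as a cross-check when comparing the
output of a SIMD implementation on the same inputs.
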