From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 20 Sep 2025 20:01:11 -0000
Message-ID: <175839847284.25.14950100753767509775@463a07221176>
X-MailFrom: code@ffmpeg.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address;
 header-match-ffmpeg-devel.ffmpeg.org-0; header-match-ffmpeg-devel.ffmpeg.org-1;
 header-match-ffmpeg-devel.ffmpeg.org-2; header-match-ffmpeg-devel.ffmpeg.org-3;
 emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest;
 max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.10
Precedence: list
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] avcodec/aarch64/vvc: Implement dmvr_v_8 (PR #20563)
List-Id: FFmpeg development discussions and patches
From: welder via ffmpeg-devel
Cc: welder
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

PR #20563 opened by welder
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20563
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20563.patch

The primary optimization is to load the first row before entering the loop instead of loading two rows each iteration.
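
For readers less familiar with NEON, the row-reuse idea described above can be illustrated with a scalar C sketch. This is not the code from the patch: the function name dmvr_v_8_sketch, its parameter list, and the MAX_PB_SIZE value of 128 are assumptions made for illustration, and the two filter taps are passed in directly instead of being read from ff_vvc_inter_luma_dmvr_filters. The rounding shift by 2 mirrors the "urshr ..., #2" instructions in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror VVC_MAX_PB_SIZE; the value is an assumption. */
#define MAX_PB_SIZE 128

/*
 * Hypothetical scalar model of a vertical 2-tap filter on 8-bit input.
 * The first row is picked up once before the loop; each iteration then
 * reads only the next row and reuses the previous one.
 */
static void dmvr_v_8_sketch(int16_t *dst, const uint8_t *src,
                            ptrdiff_t src_stride, int height, int width,
                            int c0, int c1)
{
    assert(height > 0 && width > 0 && width <= MAX_PB_SIZE);

    const uint8_t *prev = src;                    /* row 0, fetched once */

    for (int y = 0; y < height; y++) {
        const uint8_t *next = prev + src_stride;  /* the only new row */

        for (int x = 0; x < width; x++) {
            /* Rounding right shift by 2: add half, then shift. */
            dst[x] = (int16_t)((c0 * prev[x] + c1 * next[x] + 2) >> 2);
        }

        prev = next;          /* the bottom row becomes the next top row */
        dst += MAX_PB_SIZE;   /* destination stride in int16_t elements */
    }
}
```

In the NEON version the same carry-over appears as the "mov v2.16b, v3.16b" and "mov v16.16b, v17.16b" instructions at the end of each loop iteration.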
>>From 832f354be2ae0e63e8c47dd1805225bdfff21851 Mon Sep 17 00:00:00 2001
From: Krzysztof Pyrkosz
Date: Mon, 8 Sep 2025 20:56:24 +0200
Subject: [PATCH] avcodec/aarch64/vvc: Implement dmvr_v_8

A72
dmvr_v_8_12x20_neon:   207.0 ( 4.15x)
dmvr_v_8_20x12_neon:   170.4 ( 4.37x)
dmvr_v_8_20x20_neon:   273.4 ( 4.58x)

A53
dmvr_v_8_12x20_neon:   450.6 ( 4.21x)
dmvr_v_8_20x12_neon:   342.8 ( 3.70x)
dmvr_v_8_20x20_neon:   550.9 ( 3.79x)
---
 libavcodec/aarch64/vvc/dsp_init.c |  2 ++
 libavcodec/aarch64/vvc/inter.S    | 56 +++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/libavcodec/aarch64/vvc/dsp_init.c b/libavcodec/aarch64/vvc/dsp_init.c
index bdfa142a5a..b7dc1d89f8 100644
--- a/libavcodec/aarch64/vvc/dsp_init.c
+++ b/libavcodec/aarch64/vvc/dsp_init.c
@@ -101,6 +101,7 @@ DMVR_FUN(, 12)
 DMVR_FUN(h_, 8)
 DMVR_FUN(h_, 10)
 DMVR_FUN(h_, 12)
+DMVR_FUN(v_, 8)
 DMVR_FUN(hv_, 8)
 DMVR_FUN(hv_, 10)
 DMVR_FUN(hv_, 12)
@@ -195,6 +196,7 @@ void ff_vvc_dsp_init_aarch64(VVCDSPContext *const c, const int bd)
         c->inter.w_avg = vvc_w_avg_8;
         c->inter.dmvr[0][0] = ff_vvc_dmvr_8_neon;
         c->inter.dmvr[0][1] = ff_vvc_dmvr_h_8_neon;
+        c->inter.dmvr[1][0] = ff_vvc_dmvr_v_8_neon;
         c->inter.dmvr[1][1] = ff_vvc_dmvr_hv_8_neon;
         c->inter.apply_bdof = ff_vvc_apply_bdof_8_neon;
 
diff --git a/libavcodec/aarch64/vvc/inter.S b/libavcodec/aarch64/vvc/inter.S
index 01d2ff155c..d9c545ccb5 100644
--- a/libavcodec/aarch64/vvc/inter.S
+++ b/libavcodec/aarch64/vvc/inter.S
@@ -385,6 +385,62 @@ function ff_vvc_dmvr_12_neon, export=1
         ret
 endfunc
 
+function ff_vvc_dmvr_v_8_neon, export=1
+        movrel          x7, X(ff_vvc_inter_luma_dmvr_filters)
+        add             x7, x7, x5, lsl #1
+        ld2r            {v0.16b, v1.16b}, [x7]
+        tbz             w6, #4, 12f
+
+        ldr             s16, [x1, #16]
+        ld1             {v2.16b}, [x1], x2
+20:
+        ldr             s17, [x1, #16]
+        umull           v4.8h, v0.8b, v2.8b
+        umull2          v5.8h, v0.16b, v2.16b
+        ld1             {v3.16b}, [x1], x2
+        umull           v16.8h, v0.8b, v16.8b
+        umull           v6.8h, v1.8b, v3.8b
+        umull2          v7.8h, v1.16b, v3.16b
+        add             v4.8h, v4.8h, v6.8h
+        umull           v18.8h, v1.8b, v17.8b
+        add             v5.8h, v5.8h, v7.8h
+        urshr           v4.8h, v4.8h, #2
+        add             v19.4h, v16.4h, v18.4h
+        urshr           v5.8h, v5.8h, #2
+        urshr           v19.4h, v19.4h, #2
+        st1             {v4.8h, v5.8h}, [x0], #32
+        subs            w3, w3, #1
+        mov             v2.16b, v3.16b
+        st1             {v19.4h}, [x0], #8
+        mov             v16.16b, v17.16b
+        add             x0, x0, #(VVC_MAX_PB_SIZE * 2 - 32 - 8)
+        b.ne            20b
+        ret
+
+12:
+        ldr             s16, [x1, #8]
+        ld1             {v2.8b}, [x1], x2
+2:
+        ldr             s17, [x1, #8]
+        umull           v4.8h, v0.8b, v2.8b
+        ld1             {v3.8b}, [x1], x2
+        umull           v16.8h, v0.8b, v16.8b
+        umull           v6.8h, v1.8b, v3.8b
+        add             v4.8h, v4.8h, v6.8h
+        umull           v18.8h, v1.8b, v17.8b
+        srshr           v4.8h, v4.8h, #2
+        add             v19.4h, v16.4h, v18.4h
+        srshr           v19.4h, v19.4h, #2
+        st1             {v4.8h}, [x0], #16
+        subs            w3, w3, #1
+        mov             v2.16b, v3.16b
+        st1             {v19.4h}, [x0], #8
+        mov             v16.16b, v17.16b
+        add             x0, x0, #(VVC_MAX_PB_SIZE * 2 - 16 - 8)
+        b.ne            2b
+        ret
+endfunc
+
 function ff_vvc_dmvr_h_8_neon, export=1
         movrel          x7, X(ff_vvc_inter_luma_dmvr_filters)
         add             x7, x7, x4, lsl #1
--
2.49.1

_______________________________________________
ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org