Date: Sun, 2 Mar 2025 00:34:57 +0200 (EET)
From: Martin Storsjö
To: Krzysztof Pyrkosz via ffmpeg-devel
Cc: Krzysztof Pyrkosz
In-Reply-To: <20250219174010.3911-4-ffmpeg@szaka.eu>
References: <20250219174010.3911-2-ffmpeg@szaka.eu> <20250219174010.3911-4-ffmpeg@szaka.eu>
Subject: Re: [FFmpeg-devel] [PATCH 2/2] avcodec/aarch64/vvc: Use rounding shift NEON instruction

On Wed, 19 Feb 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

> ---
>
> Before and after on A78
>
> dmvr_8_12x20_neon:        86.2 ( 6.90x)
> dmvr_8_20x12_neon:        94.8 ( 5.93x)
> dmvr_8_20x20_neon:       141.5 ( 6.50x)
> dmvr_12_12x20_neon:      158.0 ( 3.76x)
> dmvr_12_20x12_neon:      151.2 ( 3.73x)
> dmvr_12_20x20_neon:      247.2 ( 3.71x)
> dmvr_hv_8_12x20_neon:    423.2 ( 3.75x)
> dmvr_hv_8_20x12_neon:    434.0 ( 3.69x)
> dmvr_hv_8_20x20_neon:    706.0 ( 3.69x)
>
> dmvr_8_12x20_neon:        77.2 ( 7.70x)
> dmvr_8_20x12_neon:        66.5 ( 8.49x)
> dmvr_8_20x20_neon:        92.2 ( 9.90x)
> dmvr_12_12x20_neon:       80.2 ( 7.38x)
> dmvr_12_20x12_neon:       58.2 ( 9.59x)
> dmvr_12_20x20_neon:       90.0 (10.15x)
> dmvr_hv_8_12x20_neon:    369.0 ( 4.34x)
> dmvr_hv_8_20x12_neon:    355.8 ( 4.49x)
> dmvr_hv_8_20x20_neon:    574.2 ( 4.51x)
>
>  libavcodec/aarch64/vvc/inter.S | 72 ++++++++++------------------------
>  1 file changed, 20 insertions(+), 52 deletions(-)
>
> diff --git a/libavcodec/aarch64/vvc/inter.S b/libavcodec/aarch64/vvc/inter.S
> index c9d698ee29..45add44b6e 100644
> --- a/libavcodec/aarch64/vvc/inter.S
> +++ b/libavcodec/aarch64/vvc/inter.S
> @@ -369,22 +369,18 @@ function ff_vvc_dmvr_8_neon, export=1
>  1:
>          cbz             w15, 2f
>          ldr             q0, [src], #16
> -        uxtl            v1.8h, v0.8b
> -        uxtl2           v2.8h, v0.16b
> -        ushl            v1.8h, v1.8h, v16.8h
> -        ushl            v2.8h, v2.8h, v16.8h
> +        ushll           v1.8h, v0.8b, #2
> +        ushll2          v2.8h, v0.16b, #2

In addition to what's mentioned in the commit message, this bit is
semantically a different one, so we should probably mention that in the
commit message as well.

If you're reposting patch 1/2 of this set, can you update the commit
message on this one, to mention this (and move the measurements into the
actual commit message).

Other than that, this patch looks very good to me, thanks!

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".