Date: Tue, 4 Mar 2025 10:36:08 +0200 (EET)
From: Martin Storsjö
To: Krzysztof Pyrkosz via ffmpeg-devel
Cc: Krzysztof Pyrkosz
In-Reply-To: <20250303213254.14193-2-ffmpeg@szaka.eu>
References: <20250303213254.14193-2-ffmpeg@szaka.eu>
Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/aarch64/vvc: Optimize NEON version of vvc_dmvr

On Mon, 3 Mar 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

> This patch replaces blocks of instructions performing rounding and
> widening shifts with one-liners achieving the same result.
>
> Before and after on A78
> dmvr_8_12x20_neon:       86.2 ( 6.90x)
> dmvr_8_20x12_neon:       94.8 ( 5.93x)
> dmvr_8_20x20_neon:      141.5 ( 6.50x)
> dmvr_12_12x20_neon:     158.0 ( 3.76x)
> dmvr_12_20x12_neon:     151.2 ( 3.73x)
> dmvr_12_20x20_neon:     247.2 ( 3.71x)
> dmvr_hv_8_12x20_neon:   423.2 ( 3.75x)
> dmvr_hv_8_20x12_neon:   434.0 ( 3.69x)
> dmvr_hv_8_20x20_neon:   706.0 ( 3.69x)
>
> dmvr_8_12x20_neon:       77.2 ( 7.70x)
> dmvr_8_20x12_neon:       66.5 ( 8.49x)
> dmvr_8_20x20_neon:       92.2 ( 9.90x)
> dmvr_12_12x20_neon:      80.2 ( 7.38x)
> dmvr_12_20x12_neon:      58.2 ( 9.59x)
> dmvr_12_20x20_neon:      90.0 (10.15x)
> dmvr_hv_8_12x20_neon:   369.0 ( 4.34x)
> dmvr_hv_8_20x12_neon:   355.8 ( 4.49x)
> dmvr_hv_8_20x20_neon:   574.2 ( 4.51x)
> ---
>  libavcodec/aarch64/vvc/inter.S | 72 ++++++++++------------------------
>  1 file changed, 20 insertions(+), 52 deletions(-)

LGTM, pushed.

// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".