Date: Sun, 17 Apr 2022 00:32:46 +0300 (EEST)
From: Martin Storsjö
To: "Swinney, Jonathan"
Cc: "Pop, Sebastian", "ffmpeg-devel@ffmpeg.org"
Subject: Re: [FFmpeg-devel] [PATCH 2/2] swscale/aarch64: add vscale specializations
Message-ID: <8a759751-5c7f-e5e9-b03c-90d655caba8c@martin.st>

On Fri, 15 Apr 2022, Swinney, Jonathan wrote:

> This commit adds new code paths for vscale when filterSize is 2, 4, or 8. By
> using specialized code with unrolling to match the filterSize we can improve
> performance.
>
> | (seconds)   | c6g   |       |       |
> | ----------- | ----- | ----- | ----- |
> | filterSize  | 2     | 4     | 8     |
> | original    | 0.581 | 0.974 | 1.744 |
> | optimized   | 0.399 | 0.569 | 1.052 |
> | improvement | 31.1% | 41.6% | 39.7% |
>
> Signed-off-by: Jonathan Swinney
> ---
>  libswscale/aarch64/output.S  | 147 +++++++++++++++++++++++++++++++++--
>  libswscale/aarch64/swscale.c |  12 +++
>  2 files changed, 153 insertions(+), 6 deletions(-)

I'll have a closer look at the assembly itself at a later time, but first:

The checkasm tests in tests/checkasm/sw_scale.c do test yuv2planeX, but
there's no testing of yuv2plane1; can you extend them to cover that too? Also,
the existing test only tests filter sizes 1, 4, 8, 16, but apparently it should
be extended to test size 2 too.

// Martin
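
For illustration only, here is a rough standalone C sketch of what a filterSize 2 specialization and a matching comparison test could look like. It is not the patch's assembly and does not use the checkasm framework. The generic loop is modeled on the 8-bit C vertical scaler in libswscale as I understand it; all names (vscale_generic, vscale_size2, count_mismatches_size2, MAX_W) are hypothetical, and the input ranges are assumptions chosen to keep the arithmetic within int.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_W 64 /* hypothetical upper bound on the test row width */

/* Clamp an intermediate value to the 8-bit output range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic vertical filter: the inner loop runs once per filter tap. */
static void vscale_generic(const int16_t *filter, int filter_size,
                           const int16_t **src, uint8_t *dest, int dst_w,
                           const uint8_t *dither, int offset)
{
    for (int i = 0; i < dst_w; i++) {
        int val = dither[(i + offset) & 7] << 12;
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = clip_u8(val >> 19);
    }
}

/* Specialization for exactly two taps: the inner loop is fully unrolled
 * and the coefficients and row pointers are loaded once up front. */
static void vscale_size2(const int16_t *filter, const int16_t **src,
                         uint8_t *dest, int dst_w,
                         const uint8_t *dither, int offset)
{
    const int16_t *s0 = src[0], *s1 = src[1];
    const int f0 = filter[0], f1 = filter[1];

    for (int i = 0; i < dst_w; i++) {
        int val = (dither[(i + offset) & 7] << 12) + s0[i] * f0 + s1[i] * f1;
        dest[i] = clip_u8(val >> 19);
    }
}

/* Run both versions on the same pseudo-random input and return the number
 * of output pixels that differ. Assumed ranges: 15-bit source samples and
 * two non-negative coefficients that sum to 4096. */
static int count_mismatches_size2(int dst_w, unsigned seed)
{
    int16_t row0[MAX_W], row1[MAX_W], filter[2];
    const int16_t *src[2] = { row0, row1 };
    uint8_t dither[8], ref[MAX_W], opt[MAX_W];
    int mismatches = 0;

    assert(dst_w > 0 && dst_w <= MAX_W);

    srand(seed);
    filter[0] = (int16_t)(rand() % 4097);
    filter[1] = (int16_t)(4096 - filter[0]);
    for (int i = 0; i < 8; i++)
        dither[i] = (uint8_t)(rand() & 0xff);
    for (int i = 0; i < dst_w; i++) {
        row0[i] = (int16_t)(rand() & 0x7fff);
        row1[i] = (int16_t)(rand() & 0x7fff);
    }

    vscale_generic(filter, 2, src, ref, dst_w, dither, 0);
    vscale_size2(filter, src, opt, dst_w, dither, 0);

    for (int i = 0; i < dst_w; i++)
        if (ref[i] != opt[i])
            mismatches++;
    return mismatches;
}
```

Calling count_mismatches_size2() with a few widths and seeds should return 0 each time, since both versions do the same integer arithmetic. A real test would instead add size 2 to the filter sizes exercised in tests/checkasm/sw_scale.c and compare the assembly against the C reference there.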