Date: Wed, 17 Apr 2024 11:19:49 +0300 (EEST)
From: Martin Storsjö
To: FFmpeg development discussions and patches
Cc: Ramiro Polla
Subject: Re: [FFmpeg-devel] [PATCH v2] lavc/aarch64/fdct: add neon-optimized fdct for aarch64
Message-ID: <1cc9a433-dba2-a02a-87f9-74ace1f1b9b0@martin.st>
In-Reply-To: <20240416225610.104477-1-ramiro.polla@gmail.com>
References: <20240416225610.104477-1-ramiro.polla@gmail.com>

On Wed, 17 Apr 2024, Ramiro Polla wrote:

> The code is imported from libjpeg-turbo-3.0.1. The neon registers used
> have been changed to avoid modifying v8-v15.
> ---
>  libavcodec/aarch64/Makefile               |   2 +
>  libavcodec/aarch64/fdct.h                 |  26 ++
>  libavcodec/aarch64/fdctdsp_init_aarch64.c |  39 +++
>  libavcodec/aarch64/fdctdsp_neon.S         | 368 ++++++++++++++++++++++
>  libavcodec/avcodec.h                      |   1 +
>  libavcodec/fdctdsp.c                      |   4 +-
>  libavcodec/fdctdsp.h                      |   2 +
>  libavcodec/options_table.h                |   1 +
>  libavcodec/tests/aarch64/dct.c            |   2 +
>  tests/checkasm/Makefile                   |   1 +
>  tests/checkasm/checkasm.c                 |   3 +
>  tests/checkasm/checkasm.h                 |   1 +
>  tests/checkasm/fdctdsp.c                  |  68 ++++
>  tests/fate/checkasm.mak                   |   1 +
>  14 files changed, 518 insertions(+), 1 deletion(-)
>  create mode 100644 libavcodec/aarch64/fdct.h
>  create mode 100644 libavcodec/aarch64/fdctdsp_init_aarch64.c
>  create mode 100644 libavcodec/aarch64/fdctdsp_neon.S
>  create mode 100644 tests/checkasm/fdctdsp.c

Overall LGTM, thanks!

You may wish to split adding the checkasm test to a separate patch, before
adding the new implementation.

I was surprised by the header libavcodec/aarch64/fdct.h which seemed
redundant on first glance, but I see that this is needed for the dct test
executable in libavcodec/tests/aarch64/dct.c, so I guess this is
reasonable. (In most other asm implementations, we just declare the
functions at the start of the *_init.c files.)

// Martin