Date: Thu, 16 Mar 2023 22:56:09 +0100
From: Michael Niedermayer
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH v3] avcodec/mathops: Optimize generic mid_pred function
Message-ID: <20230316215609.GG375355@pb2>
References: <20230307090806.2003-1-zhujunxian@oss.cipunited.com> <20230307204527.GF1928637@pb2>

On Wed, Mar 15, 2023 at 06:09:13PM +0800, YunQiang Su wrote:
> On Wed, Mar 8, 2023 at 04:45, Michael Niedermayer wrote:
> >
> > On Tue, Mar 07, 2023 at 05:08:27PM +0800, Junxian Zhu wrote:
> > > From: Junxian Zhu
> > >
> > > Rewrite the mid_pred function in the generic mathops.h to reduce
> > > branch jumps and improve performance. Because recent compilers can
> > > now generate assembly code as short as the handwritten version for
> > > this function, remove the MIPS-specific optimized inline assembly
> > > mathops.h.
> >
> > as you write, that it improves performance
> > what speed effect does this have exactly?
> >
> > thx
> >
>
> I tested the performance, using this code
[...]
> On MacOS 13.2 with Apple M1:
> The old code    the new code
> 2.1s            2.3s
>
> On Cavium ThunderX / arm64 (GCC 10.2.1 -O3)
> The old code    the new code
> 52.7s           37.8s
>
> On Loongson 3A4000/mips64el (GCC 10.2.1 -O3)
> The old code    the new code
> 90s             5s
>
> On Intel(R) Xeon(R) CPU E7-4820 v4 @ 2.00GHz (GCC 10.2.1 -O3)
> The old code    the new code
> 14.4s           15.4s
>
> On SF19A2890/MIPS interAptiv (GCC 10.2.1 -O3)
> The old code    the new code
> 314s            39.3s
>
> On Intel(R) Xeon(R) CPU E7-4820 v4 @ 2.00GHz (GCC 12.2.0 -O3)
> The old code    the new code
> 14.4s           8.8s
>
> On sifive,bullet0/rv64imafdc (GCC 12.2.0 -O3, 1e6 times instead of 1e7)
> The old code    the new code
> 11.9s           15.2s
>
> On Freescale i.MX53/ARMv7 Processor rev 5 (v7l) (GCC 12.2.0 -O3, 1e6 times instead of 1e7)
> The old code    the new code
> 24.1s           15.7s
>
> On POWER8 (architected), altivec supported, BIG ENDIAN, ppc64 (GCC 12.2.0 -O3)
> The old code    the new code
> 43.1s           50.8s
>
> On POWER8 (architected), altivec supported, LITTLE ENDIAN, ppc64el (GCC 12.2.0 -O3)
> The old code    the new code
> 7.8s            4.7s
>
> On PA8900 (Shortfin) PA-RISC (GCC 12.2.0 -O3 1e6 times instead of 1e7)
> The old code    the new code
> 39.9s           47.2s
>
> On IBM/S390 aka s390x (GCC 12.2.0 -O3)
> The old code    the new code
> 82.2s           30.8s
>
> On Intel(R) Itanium(R) Processor 9320 (GCC 12.2.0 -O3)
> The old code    the new code
> 89.5s           78.1s
>
> Cavium Octeon III V0.2 FPU V0.0 /mipsel (GCC 12.2.0 -O3)
> The old code    the new code
> 117.5s          118.5s

These cover a quite extensive set of hw, impressive

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship: All citizens are under surveillance, all their steps and
actions recorded, for the politicians to enforce control.
Democracy: All politicians are under surveillance, all their steps and
actions recorded, for the citizens to enforce control.
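
For context on what is being benchmarked in this thread: mid_pred returns the median of three integers. The patch itself and the benchmark program are not quoted in this message, so the following is only a minimal sketch of the general idea of trading nested branches for min/max selections. The function names (median3_branchy, median3_minmax, median3_selfcheck) are hypothetical and are not taken from FFmpeg or from the patch.

```c
#include <assert.h>

/* Illustrative only: a nested-branch median of three.
 * Not the code from mathops.h; names are hypothetical. */
static int median3_branchy(int a, int b, int c)
{
    if (a > b) {
        if (c > b)
            b = (c > a) ? a : c;   /* b is smallest: median is min(a, c) */
    } else {
        if (b > c)
            b = (c > a) ? c : a;   /* b is largest: median is max(a, c) */
    }
    return b;
}

static int min_int(int x, int y) { return x < y ? x : y; }
static int max_int(int x, int y) { return x > y ? x : y; }

/* Illustrative only: the same result expressed as min/max selections,
 * i.e. c clamped to the range [min(a, b), max(a, b)]. A compiler may
 * turn these selections into conditional moves, but that is not
 * guaranteed and depends on target and compiler version. */
static int median3_minmax(int a, int b, int c)
{
    int lo = min_int(a, b);
    int hi = max_int(a, b);
    return max_int(lo, min_int(hi, c));
}

/* Exhaustively compare both forms on a small range of inputs. */
static void median3_selfcheck(void)
{
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++)
            for (int c = -3; c <= 3; c++)
                assert(median3_branchy(a, b, c) == median3_minmax(a, b, c));
}
```

Calling median3_selfcheck() confirms that the two forms agree on the tested inputs. It says nothing about speed, which would have to be measured per target and compiler.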