Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v3] avcodec/mathops: Optimize generic mid_pred function
Date: Thu, 16 Mar 2023 22:56:09 +0100
Message-ID: <20230316215609.GG375355@pb2>
In-Reply-To: <CAKcpw6XDYEMrzV0Cg7RKZa5i_UC4GtBN1nU6MuQ8GoL0FX3XGA@mail.gmail.com>


On Wed, Mar 15, 2023 at 06:09:13PM +0800, YunQiang Su wrote:
> Michael Niedermayer <michael@niedermayer.cc> wrote on Wed, Mar 8, 2023 at 04:45:
> >
> > On Tue, Mar 07, 2023 at 05:08:27PM +0800, Junxian Zhu wrote:
> > > From: Junxian Zhu <zhujunxian@oss.cipunited.com>
> > >
> > > Rewrite the mid_pred function in the generic mathops.h to reduce branching and improve performance. Because recent compiler versions can generate assembly as short as the handwritten version of this function, also remove the MIPS-specific optimized inline assembly from mathops.h.
> >
> > You write that it improves performance.
> > What speed effect does this have exactly?
> > thx
> >
> 
> I tested the performance using this code:
[...]
> On macOS 13.2 with Apple M1:
> The old code    The new code
> 2.1s            2.3s
>
> On Cavium ThunderX / arm64 (GCC 10.2.1 -O3):
> The old code    The new code
> 52.7s           37.8s
>
> On Loongson 3A4000/mips64el (GCC 10.2.1 -O3):
> The old code    The new code
> 90s             5s
>
> On Intel(R) Xeon(R) CPU E7-4820 v4 @ 2.00GHz (GCC 10.2.1 -O3):
> The old code    The new code
> 14.4s           15.4s
>
> On SF19A2890/MIPS interAptiv (GCC 10.2.1 -O3):
> The old code    The new code
> 314s            39.3s
>
> On Intel(R) Xeon(R) CPU E7-4820 v4 @ 2.00GHz (GCC 12.2.0 -O3):
> The old code    The new code
> 14.4s           8.8s
>
> On sifive,bullet0/rv64imafdc (GCC 12.2.0 -O3, 1e6 times instead of 1e7):
> The old code    The new code
> 11.9s           15.2s
>
> On Freescale i.MX53/ARMv7 Processor rev 5 (v7l) (GCC 12.2.0 -O3, 1e6 times instead of 1e7):
> The old code    The new code
> 24.1s           15.7s
>
> On POWER8 (architected), altivec supported, BIG ENDIAN, ppc64 (GCC 12.2.0 -O3):
> The old code    The new code
> 43.1s           50.8s
>
> On POWER8 (architected), altivec supported, LITTLE ENDIAN, ppc64el (GCC 12.2.0 -O3):
> The old code    The new code
> 7.8s            4.7s
>
> On PA8900 (Shortfin) PA-RISC (GCC 12.2.0 -O3, 1e6 times instead of 1e7):
> The old code    The new code
> 39.9s           47.2s
>
> On IBM/S390 aka s390x (GCC 12.2.0 -O3):
> The old code    The new code
> 82.2s           30.8s
>
> On Intel(R) Itanium(R) Processor 9320 (GCC 12.2.0 -O3):
> The old code    The new code
> 89.5s           78.1s
>
> On Cavium Octeon III V0.2 FPU V0.0 /mipsel (GCC 12.2.0 -O3):
> The old code    The new code
> 117.5s          118.5s
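
The timings quoted above are easier to compare as an old/new ratio per platform. A minimal sketch in C, using four rows copied from the quoted results; the names `bench_row`, `rows`, `speedup` and `print_speedups` are hypothetical and not part of the patch or of the test program used in the thread. A ratio above 1 means the new code ran faster, below 1 means it ran slower.

```c
#include <stdio.h>

/* One row of the quoted results: wall-clock seconds for old and new code. */
struct bench_row {
    const char *platform;
    double old_s;
    double new_s;
};

/* A few rows taken verbatim from the quoted measurements. */
static const struct bench_row rows[] = {
    { "Loongson 3A4000/mips64el (GCC 10.2.1)", 90.0,  5.0  },
    { "SF19A2890/MIPS interAptiv (GCC 10.2.1)", 314.0, 39.3 },
    { "Xeon E7-4820 v4 (GCC 10.2.1)",           14.4,  15.4 },
    { "Xeon E7-4820 v4 (GCC 12.2.0)",           14.4,  8.8  },
};

/* Ratio > 1: new code is faster; ratio < 1: new code is slower. */
static double speedup(double old_s, double new_s)
{
    return old_s / new_s;
}

/* Print one line per row with the old/new ratio. */
void print_speedups(void)
{
    size_t n = sizeof(rows) / sizeof(rows[0]);
    for (size_t i = 0; i < n; i++)
        printf("%-42s %6.2fx\n", rows[i].platform,
               speedup(rows[i].old_s, rows[i].new_s));
}
```

Calling `print_speedups()` from a `main` prints one ratio per row. Note that the two Xeon rows differ only in compiler version, so the ratio there reflects code generation as much as the source change.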

These cover quite an extensive set of hardware, impressive.

thx
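
For context, mid_pred returns the median of three integers. The sketch below shows one common way to write a median-of-three with few branches, assuming a min/max formulation that compilers can often lower to conditional moves or min/max instructions. It is an illustration only: it is not the code from the patch under discussion, and the names `min_int`, `max_int` and `mid_pred_sketch` are hypothetical.

```c
/* Helper: smaller of two ints (hypothetical name). */
static inline int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Helper: larger of two ints (hypothetical name). */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/*
 * Median of three via min/max:
 *   median(a, b, c) = max(min(a, b), min(max(a, b), c))
 * Sort a and b into lo <= hi, clamp c from above by hi, then take the
 * larger of lo and that clamped value.
 */
static inline int mid_pred_sketch(int a, int b, int c)
{
    int lo = min_int(a, b);
    int hi = max_int(a, b);
    return max_int(lo, min_int(hi, c));
}
```

Whether this form is actually faster depends on the compiler and target, which is what the per-platform timings in this thread measure.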

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship: All citizens are under surveillance, all their steps and
actions recorded, for the politicians to enforce control.
Democracy: All politicians are under surveillance, all their steps and
actions recorded, for the citizens to enforce control.

