From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 1/1] lavc/aarch64: add some neon pix_abs functions
Date: Fri, 15 Apr 2022 18:43:48 +0200
Message-ID: <20220415164348.GN2829255@pb2>
In-Reply-To: <50530740b25747fbbfd138adabdc4a8f@EX13D07UWB004.ant.amazon.com>

On Thu, Apr 14, 2022 at 04:22:58PM +0000, Swinney, Jonathan wrote:
>  - ff_pix_abs16_neon
>  - ff_pix_abs16_xy2_neon
>
> In direct micro benchmarks of these ff functions verses their C implementations,
> these functions performed as follows on AWS Graviton 2:
>
> ff_pix_abs16_neon:
> c:  benchmark ran 100000 iterations in 0.955383 seconds
> ff: benchmark ran 100000 iterations in 0.097669 seconds
>
> ff_pix_abs16_xy2_neon:
> c:  benchmark ran 100000 iterations in 1.916759 seconds
> ff: benchmark ran 100000 iterations in 0.370729 seconds
>
> Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
> ---
>  libavcodec/aarch64/Makefile              |   2 +
>  libavcodec/aarch64/me_cmp_init_aarch64.c |  39 +++++
>  libavcodec/aarch64/me_cmp_neon.S         | 209 +++++++++++++++++++++++
>  libavcodec/me_cmp.c                      |   2 +
>  libavcodec/me_cmp.h                      |   1 +
>  libavcodec/x86/me_cmp.asm                |   7 +
>  libavcodec/x86/me_cmp_init.c             |   3 +
>  tests/checkasm/Makefile                  |   2 +-
>  tests/checkasm/checkasm.c                |   1 +
>  tests/checkasm/checkasm.h                |   1 +
>  tests/checkasm/motion.c                  | 155 +++++++++++++++++
>  11 files changed, 421 insertions(+), 1 deletion(-)
>  create mode 100644 libavcodec/aarch64/me_cmp_init_aarch64.c
>  create mode 100644 libavcodec/aarch64/me_cmp_neon.S
>  create mode 100644 tests/checkasm/motion.c
>
[...]
> diff --git a/libavcodec/x86/me_cmp.asm b/libavcodec/x86/me_cmp.asm
> index ad06d485ab..f73b9f9161 100644
> --- a/libavcodec/x86/me_cmp.asm
> +++ b/libavcodec/x86/me_cmp.asm
> @@ -255,6 +255,7 @@ hadamard8x8_diff %+ SUFFIX:
>
>      HSUM                        m0, m1, eax
>      and                         rax, 0xFFFF
> +    emms
>      ret
>
>  hadamard8_16_wrapper 0, 14
> @@ -345,6 +346,7 @@ cglobal sse%1, 5,5,8, v, pix1, pix2, lsize, h
>
>      HADDD     m7, m1
>      movd     eax, m7         ; return value
> +    emms
>      RET
>  %endmacro

on which arm chip did you test this?

[...]

> diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> index 9af911bb88..b330868a38 100644
> --- a/libavcodec/x86/me_cmp_init.c
> +++ b/libavcodec/x86/me_cmp_init.c
> @@ -186,6 +186,8 @@ static int vsad_intra16_mmx(MpegEncContext *v, uint8_t *pix, uint8_t *dummy,
>          : "r" (stride), "m" (h)
>          : "%ecx");
>
> +    emms_c();
> +
>      return tmp & 0xFFFF;
>  }
>  #undef SUM
> @@ -418,6 +420,7 @@ static inline int sum_mmx(void)
>          "paddw %%mm0, %%mm6             \n\t"
>          "movd %%mm6, %0                 \n\t"
>          : "=r" (ret));
> +    emms_c();
>      return ret & 0xFFFF;
>  }

hmmm

Also, before the patch:
checkasm: all 6153 tests passed
after it:
checkasm: all 3198 tests passed

that's on x86-64

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact