Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Michael Niedermayer <michael@niedermayer.cc>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 1/1] lavc/aarch64: add some neon pix_abs functions
Date: Fri, 15 Apr 2022 18:43:48 +0200
Message-ID: <20220415164348.GN2829255@pb2>
In-Reply-To: <50530740b25747fbbfd138adabdc4a8f@EX13D07UWB004.ant.amazon.com>


On Thu, Apr 14, 2022 at 04:22:58PM +0000, Swinney, Jonathan wrote:
>  - ff_pix_abs16_neon
>  - ff_pix_abs16_xy2_neon
> 
> In direct micro benchmarks of these ff functions versus their C implementations,
> these functions performed as follows on AWS Graviton 2:
> 
> ff_pix_abs16_neon:
> c:  benchmark ran 100000 iterations in 0.955383 seconds
> ff: benchmark ran 100000 iterations in 0.097669 seconds
> 
> ff_pix_abs16_xy2_neon:
> c:  benchmark ran 100000 iterations in 1.916759 seconds
> ff: benchmark ran 100000 iterations in 0.370729 seconds
> 
> Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
> ---
>  libavcodec/aarch64/Makefile              |   2 +
>  libavcodec/aarch64/me_cmp_init_aarch64.c |  39 +++++
>  libavcodec/aarch64/me_cmp_neon.S         | 209 +++++++++++++++++++++++
>  libavcodec/me_cmp.c                      |   2 +
>  libavcodec/me_cmp.h                      |   1 +
>  libavcodec/x86/me_cmp.asm                |   7 +
>  libavcodec/x86/me_cmp_init.c             |   3 +
>  tests/checkasm/Makefile                  |   2 +-
>  tests/checkasm/checkasm.c                |   1 +
>  tests/checkasm/checkasm.h                |   1 +
>  tests/checkasm/motion.c                  | 155 +++++++++++++++++
>  11 files changed, 421 insertions(+), 1 deletion(-)
>  create mode 100644 libavcodec/aarch64/me_cmp_init_aarch64.c
>  create mode 100644 libavcodec/aarch64/me_cmp_neon.S
>  create mode 100644 tests/checkasm/motion.c
> 
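
For readers who do not know these routines: the two functions named in the commit message compute a sum of absolute differences (SAD) over a 16-pixel-wide block, which motion estimation uses as a matching cost. Below is a minimal scalar sketch of that computation, assuming 8-bit pixels and a shared stride for both buffers. The names sad16_ref and sad16_xy2_ref are hypothetical and are not FFmpeg's actual symbols or signatures; the xy2 variant compares against the rounded average of four neighbouring reference pixels (half-pel interpolation in both x and y).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference helper (not FFmpeg's API):
 * plain SAD over a block 16 pixels wide and h rows high. */
static int sad16_ref(const uint8_t *pix1, const uint8_t *pix2,
                     ptrdiff_t stride, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Hypothetical reference helper (not FFmpeg's API):
 * SAD against the rounded 4-pixel average of the reference block.
 * Note that this reads 17 columns and h + 1 rows of pix2. */
static int sad16_xy2_ref(const uint8_t *pix1, const uint8_t *pix2,
                         ptrdiff_t stride, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++) {
        const uint8_t *next = pix2 + stride; /* row below the current one */

        for (int x = 0; x < 16; x++) {
            int avg = (pix2[x] + pix2[x + 1] +
                       next[x] + next[x + 1] + 2) >> 2;
            sum += abs(pix1[x] - avg);
        }
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}
```

A SIMD version has to return exactly the same integer as a scalar loop like this for every input, which is the kind of equivalence the new tests/checkasm/motion.c is presumably there to check. From the timings quoted above, the NEON versions come out at roughly 9.8x (pix_abs16) and 5.2x (pix_abs16_xy2) faster than C on that machine.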
[...]
> diff --git a/libavcodec/x86/me_cmp.asm b/libavcodec/x86/me_cmp.asm
> index ad06d485ab..f73b9f9161 100644
> --- a/libavcodec/x86/me_cmp.asm
> +++ b/libavcodec/x86/me_cmp.asm
> @@ -255,6 +255,7 @@ hadamard8x8_diff %+ SUFFIX:
>  
>      HSUM                         m0, m1, eax
>      and                         rax, 0xFFFF
> +    emms
>      ret
>  
>  hadamard8_16_wrapper 0, 14
> @@ -345,6 +346,7 @@ cglobal sse%1, 5,5,8, v, pix1, pix2, lsize, h
>  
>      HADDD     m7, m1
>      movd     eax, m7         ; return value
> +    emms
>      RET
>  %endmacro

On which ARM chip did you test this?


[...]
> diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> index 9af911bb88..b330868a38 100644
> --- a/libavcodec/x86/me_cmp_init.c
> +++ b/libavcodec/x86/me_cmp_init.c
> @@ -186,6 +186,8 @@ static int vsad_intra16_mmx(MpegEncContext *v, uint8_t *pix, uint8_t *dummy,
>          : "r" (stride), "m" (h)
>          : "%ecx");
>  
> +    emms_c();
> +
>      return tmp & 0xFFFF;
>  }
>  #undef SUM
> @@ -418,6 +420,7 @@ static inline int sum_mmx(void)
>          "paddw %%mm0, %%mm6             \n\t"
>          "movd %%mm6, %0                 \n\t"
>          : "=r" (ret));
> +    emms_c();
>      return ret & 0xFFFF;
>  }

hmmm

Also, before the patch:
checkasm: all 6153 tests passed
and after it:
checkasm: all 3198 tests passed

That's on x86-64.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact

