Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: "Martin Storsjö" <martin@martin.st>
To: Krzysztof Pyrkosz via ffmpeg-devel <ffmpeg-devel@ffmpeg.org>
Cc: Krzysztof Pyrkosz <ffmpeg@szaka.eu>
Subject: Re: [FFmpeg-devel] [PATCH] avutil/aarch64/float_dsp_neon: Refactor ff_vector_fmac_scalar_neon
Date: Sun, 19 Jan 2025 22:41:22 +0200 (EET)
Message-ID: <a11d158b-26da-d0f6-2dd4-8a17f057275@martin.st> (raw)
In-Reply-To: <20250119150440.26868-2-ffmpeg@szaka.eu>

On Sun, 19 Jan 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

> ---
> libavutil/aarch64/float_dsp_neon.S | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/libavutil/aarch64/float_dsp_neon.S b/libavutil/aarch64/float_dsp_neon.S
> index 35e2715b87..b21f34c084 100644
> --- a/libavutil/aarch64/float_dsp_neon.S
> +++ b/libavutil/aarch64/float_dsp_neon.S
> @@ -40,18 +40,17 @@ function ff_vector_fmul_neon, export=1
> endfunc
>
> function ff_vector_fmac_scalar_neon, export=1
> -        mov             x3,  #-32
> 1:      subs            w2,  w2,  #16
> -        ld1             {v16.4s, v17.4s}, [x0], #32
> -        ld1             {v18.4s, v19.4s}, [x0], x3
> -        ld1             {v4.4s,  v5.4s},  [x1], #32
> -        ld1             {v6.4s,  v7.4s},  [x1], #32
> +        ldp             q16, q17, [x0]
> +        ldp             q4, q5, [x1], #32
>         fmla            v16.4s, v4.4s,  v0.s[0]
>         fmla            v17.4s, v5.4s,  v0.s[0]
> +        stp             q16, q17, [x0], #32
> +        ldp             q18, q19, [x0]
> +        ldp             q6, q7, [x1], #32
>         fmla            v18.4s, v6.4s,  v0.s[0]
>         fmla            v19.4s, v7.4s,  v0.s[0]
> -        st1             {v16.4s, v17.4s}, [x0], #32
> -        st1             {v18.4s, v19.4s}, [x0], #32
> +        stp             q18, q19, [x0], #32
>         b.ne            1b
>         ret
> endfunc
> --

The change looks ok to me, but for changes like this, please do quote 
actual checkasm benchmark numbers, with the actual numbers before/after 
(not just the change in speedup factor), and include such numbers in the 
commit message for reference. And include info about what core you have 
benchmarked it on.


The reason for that is that various tunings of assembly like this can have 
a lot of effect on one core, and little/no effect on others, or even 
contradictory effects on others. So in order not to go back and forth (if 
someone else comes along next week and does a different tuning for a 
different core), please include info about the actual numbers and what you 
have benchmarked on.


I benchmarked this change, on a handful of different cores, and got the 
following numbers:

Before:               Cortex A53    A72     A73    A78
vector_fmac_scalar_neon:   606.0  213.8   321.8  115.5
After:
vector_fmac_scalar_neon:   608.0  198.2   303.2   90.8

So it shows no change (but no regression either) on A53, and a 5-27% 
speedup on in-order cores. So it looks like a reasonable tradeoff.


However, to show the contrast, if one were to favour optimizations for the 
A53, one could do this simple change:

--- a/libavutil/aarch64/float_dsp_neon.S
+++ b/libavutil/aarch64/float_dsp_neon.S
@@ -43,8 +43,8 @@ function ff_vector_fmac_scalar_neon, export=1
          mov             x3,  #-32
  1:      subs            w2,  w2,  #16
          ld1             {v16.4s, v17.4s}, [x0], #32
-        ld1             {v18.4s, v19.4s}, [x0], x3
          ld1             {v4.4s,  v5.4s},  [x1], #32
+        ld1             {v18.4s, v19.4s}, [x0], x3
          ld1             {v6.4s,  v7.4s},  [x1], #32
          fmla            v16.4s, v4.4s,  v0.s[0]
          fmla            v17.4s, v5.4s,  v0.s[0]

This gives the following numbers:

Before:               Cortex A53    A72     A73    A78
vector_fmac_scalar_neon:   606.0  213.8   321.8  115.5
After:
vector_fmac_scalar_neon:   572.0  213.8   308.0  115.2

I.e. such a change would be almost as good for the A73 and even better for 
A53, while having no effect on A72 and A78.

In this case, the change probably is fine and probably a good compromise. 
But I wanted to show this example to highlight the fact that tunings of 
functions can be very subjective.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

      parent reply	other threads:[~2025-01-19 20:41 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-19 15:04 [FFmpeg-devel] " Krzysztof Pyrkosz via ffmpeg-devel
2025-01-19 15:04 ` [FFmpeg-devel] [PATCH] " Krzysztof Pyrkosz via ffmpeg-devel
2025-01-19 16:04   ` Lynne
2025-01-19 20:41   ` Martin Storsjö [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a11d158b-26da-d0f6-2dd4-8a17f057275@martin.st \
    --to=martin@martin.st \
    --cc=ffmpeg-devel@ffmpeg.org \
    --cc=ffmpeg@szaka.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git