Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: "Martin Storsjö" <martin@martin.st>
To: John Cox <jc@kynesim.co.uk>
Cc: thomas.mundt@hr.de,
	FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 08/15] avfilter/vf_bwdif: Add neon for filter_edge
Date: Sun, 2 Jul 2023 23:36:36 +0300 (EEST)
Message-ID: <2b253d35-7d9-10a3-33c8-5558a3691b73@martin.st> (raw)
In-Reply-To: <u6l2ai9dmlguiki6lnha1kq9rllutvvlv5@4ax.com>

On Sun, 2 Jul 2023, John Cox wrote:

> On Sun, 2 Jul 2023 00:40:09 +0300 (EEST), you wrote:
>
>> On Thu, 29 Jun 2023, John Cox wrote:
>>
>>> Signed-off-by: John Cox <jc@kynesim.co.uk>
>>> ---
>>> libavfilter/aarch64/vf_bwdif_init_aarch64.c |  20 ++++
>>> libavfilter/aarch64/vf_bwdif_neon.S         | 104 ++++++++++++++++++++
>>> 2 files changed, 124 insertions(+)
>>>
>>> diff --git a/libavfilter/aarch64/vf_bwdif_init_aarch64.c b/libavfilter/aarch64/vf_bwdif_init_aarch64.c
>>> index 3ffaa07ab3..e75cf2f204 100644
>>> --- a/libavfilter/aarch64/vf_bwdif_init_aarch64.c
>>> +++ b/libavfilter/aarch64/vf_bwdif_init_aarch64.c
>>> @@ -24,10 +24,29 @@
>>> #include "libavfilter/bwdif.h"
>>> #include "libavutil/aarch64/cpu.h"
>>>
>>> +void ff_bwdif_filter_edge_neon(void *dst1, void *prev1, void *cur1, void *next1,
>>> +                               int w, int prefs, int mrefs, int prefs2, int mrefs2,
>>> +                               int parity, int clip_max, int spat);
>>> +
>>> void ff_bwdif_filter_intra_neon(void *dst1, void *cur1, int w, int prefs, int mrefs,
>>>                                 int prefs3, int mrefs3, int parity, int clip_max);
>>>
>>>
>>> +static void filter_edge_helper(void *dst1, void *prev1, void *cur1, void *next1,
>>> +                               int w, int prefs, int mrefs, int prefs2, int mrefs2,
>>> +                               int parity, int clip_max, int spat)
>>> +{
>>> +    const int w0 = clip_max != 255 ? 0 : w & ~15;
>>> +
>>> +    ff_bwdif_filter_edge_neon(dst1, prev1, cur1, next1, w0, prefs, mrefs, prefs2, mrefs2,
>>> +                              parity, clip_max, spat);
>>> +
>>> +    if (w0 < w)
>>> +        ff_bwdif_filter_edge_c((char *)dst1 + w0, (char *)prev1 + w0, (char *)cur1 + w0, (char *)next1 + w0,
>>> +                               w - w0, prefs, mrefs, prefs2, mrefs2,
>>> +                               parity, clip_max, spat);
>>> +}
>>> +
>>> static void filter_intra_helper(void *dst1, void *cur1, int w, int prefs, int mrefs,
>>>                                 int prefs3, int mrefs3, int parity, int clip_max)
>>> {
>>> @@ -52,5 +71,6 @@ ff_bwdif_init_aarch64(BWDIFContext *s, int bit_depth)
>>>         return;
>>>
>>>     s->filter_intra = filter_intra_helper;
>>> +    s->filter_edge  = filter_edge_helper;
>>> }
>>>
>>> diff --git a/libavfilter/aarch64/vf_bwdif_neon.S b/libavfilter/aarch64/vf_bwdif_neon.S
>>> index 6c5d1598f4..a33b235882 100644
>>> --- a/libavfilter/aarch64/vf_bwdif_neon.S
>>> +++ b/libavfilter/aarch64/vf_bwdif_neon.S
>>> @@ -128,6 +128,110 @@ coeffs:
>>>         .hword          5570, 3801, 1016, -3801         // hf[0] = v0.h[2], -hf[1] = v0.h[5]
>>>         .hword          5077, 981                       // sp[0] = v0.h[6]
>>>
>>> +// ============================================================================
>>> +//
>>> +// void ff_bwdif_filter_edge_neon(
>>> +//      void *dst1,     // x0
>>> +//      void *prev1,    // x1
>>> +//      void *cur1,     // x2
>>> +//      void *next1,    // x3
>>> +//      int w,          // w4
>>> +//      int prefs,      // w5
>>> +//      int mrefs,      // w6
>>> +//      int prefs2,     // w7
>>> +//      int mrefs2,     // [sp, #0]
>>> +//      int parity,     // [sp, #8]
>>> +//      int clip_max,   // [sp, #16]  unused
>>> +//      int spat);      // [sp, #24]
>>
>> This doesn't hold for macOS targets (and the checkasm tests fail on that
>> platform).
>>
>> On macOS, arguments that aren't passed in registers but on the stack, are
>> tightly packed. So since parity is 32 bit and mrefs2 also was 32 bit,
>> parity is available at [sp, #4].
>>
>> Therefore, it's usually simplest for portability reasons, to pass any
>> arguments after the first 8, as intptr_t or ptrdiff_t, as that makes them
>> consistent across platforms.
>
> Not my interface - this is already existing code. What do you suggest I
> do?
>
> I'm happy either to change the interface or fix my stack offsets if
> there is any clue that lets me detect this ABI. As personal preference
> I'd choose the latter.

You can litter the code with "#ifdef __APPLE__", but in general I'd prefer 
to not go down that route as it makes the code messier to maintain, and 
explicitly requires you to test the code on both system combinations 
whenever touching it, compared to just having to test it once if there's a 
shared codepath.

We often do change function interfaces to make writing the assembly 
simpler; we often change parameters like "int stride" into "ptrdiff_t 
stride", to avoid needing explicit sign extension instructions on some 
architectures. Changing interfaces for this reason is less common though 
(we often don't have many functions that take that many parameters), but 
it shouldn't require a lot of changes for other architectures with 
existing assembly.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

  reply	other threads:[~2023-07-02 20:36 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-29 17:57 [FFmpeg-devel] [PATCH 00/15] avfilter/vf_bwdif: Add aarch64 neon functions John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 01/15] avfilter/vf_bwdif: Add outline for aarch " John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 02/15] avfilter/vf_bwdif: Add common macros and consts for aarch64 neon John Cox
2023-07-01 21:35   ` Martin Storsjö
2023-07-02 10:27     ` John Cox
2023-07-02 20:07       ` Martin Storsjö
2023-07-02 21:02         ` Martin Storsjö
2023-07-03  8:31           ` John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 03/15] avfilter/vf_bwdif: Export C filter_intra John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 04/15] avfilter/vf_bwdif: Add neon for filter_intra John Cox
2023-07-01 21:37   ` Martin Storsjö
2023-07-02 10:43     ` John Cox
2023-07-02 20:18       ` Martin Storsjö
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 05/15] tests/checkasm: Add test for vf_bwdif filter_intra John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 06/15] avfilter/vf_bwdif: Add clip and spatial macros for aarch64 neon John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 07/15] avfilter/vf_bwdif: Export C filter_edge John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 08/15] avfilter/vf_bwdif: Add neon for filter_edge John Cox
2023-07-01 21:40   ` Martin Storsjö
2023-07-02 10:50     ` John Cox
2023-07-02 20:36       ` Martin Storsjö [this message]
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 09/15] tests/checkasm: Add test for vf_bwdif filter_edge John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 10/15] avfilter/vf_bwdif: Export C filter_line John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 11/15] avfilter/vf_bwdif: Add neon for filter_line John Cox
2023-07-01 21:44   ` Martin Storsjö
2023-07-02 10:57     ` John Cox
2023-07-02 20:40       ` Martin Storsjö
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 12/15] avfilter/vf_bwdif: Add a filter_line3 method for optimisation John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 13/15] avfilter/vf_bwdif: Add neon for filter_line3 John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 14/15] tests/checkasm: Add test for vf_bwdif filter_line3 John Cox
2023-06-29 17:57 ` [FFmpeg-devel] [PATCH 15/15] avfilter/vf_bwdif: Block filter slices into a multiple of 4 lines John Cox
2023-07-01 21:33 ` [FFmpeg-devel] [PATCH 00/15] avfilter/vf_bwdif: Add aarch64 neon functions Martin Storsjö
2023-07-02 10:18   ` John Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b253d35-7d9-10a3-33c8-5558a3691b73@martin.st \
    --to=martin@martin.st \
    --cc=ffmpeg-devel@ffmpeg.org \
    --cc=jc@kynesim.co.uk \
    --cc=thomas.mundt@hr.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git