From: Lynne <dev@lynne.ee>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 06/15] avfilter/vf_bwdif: Add clip and spatial macros for aarch64 neon
Date: Sun, 2 Jul 2023 16:02:22 +0200 (CEST)
Message-ID: <NZLuLSf--3-9@lynne.ee>
In-Reply-To: <20230702123242.232484-7-jc@kynesim.co.uk>
Jul 2, 2023, 14:34 by jc@kynesim.co.uk:
> Signed-off-by: John Cox <jc@kynesim.co.uk>
> ---
> libavfilter/aarch64/vf_bwdif_neon.S | 73 +++++++++++++++++++++++++++++
> 1 file changed, 73 insertions(+)
>
> diff --git a/libavfilter/aarch64/vf_bwdif_neon.S b/libavfilter/aarch64/vf_bwdif_neon.S
> index 6a614f8d6e..48dc7bcd9d 100644
> --- a/libavfilter/aarch64/vf_bwdif_neon.S
> +++ b/libavfilter/aarch64/vf_bwdif_neon.S
> @@ -66,6 +66,79 @@
> umlsl2 \a3\().4s, \s1\().8h, \k
> .endm
>
> +// int b = m2s1 - m1;
> +// int f = p2s1 - p1;
> +// int dc = c0s1 - m1;
> +// int de = c0s1 - p1;
> +// int sp_max = FFMIN(p1 - c0s1, m1 - c0s1);
> +// sp_max = FFMIN(sp_max, FFMAX(-b,-f));
> +// int sp_min = FFMIN(c0s1 - p1, c0s1 - m1);
> +// sp_min = FFMIN(sp_min, FFMAX(b,f));
> +// diff = diff == 0 ? 0 : FFMAX3(diff, sp_min, sp_max);
> +.macro SPAT_CHECK diff, m2s1, m1, c0s1, p1, p2s1, t0, t1, t2, t3
> + uqsub \t0\().16b, \p1\().16b, \c0s1\().16b
> + uqsub \t2\().16b, \m1\().16b, \c0s1\().16b
> + umin \t2\().16b, \t0\().16b, \t2\().16b
> +
> + uqsub \t1\().16b, \m1\().16b, \m2s1\().16b
> + uqsub \t3\().16b, \p1\().16b, \p2s1\().16b
> + umax \t3\().16b, \t3\().16b, \t1\().16b
> + umin \t3\().16b, \t3\().16b, \t2\().16b
> +
> + uqsub \t0\().16b, \c0s1\().16b, \p1\().16b
> + uqsub \t2\().16b, \c0s1\().16b, \m1\().16b
> + umin \t2\().16b, \t0\().16b, \t2\().16b
> +
> + uqsub \t1\().16b, \m2s1\().16b, \m1\().16b
> + uqsub \t0\().16b, \p2s1\().16b, \p1\().16b
> + umax \t0\().16b, \t0\().16b, \t1\().16b
> + umin \t2\().16b, \t2\().16b, \t0\().16b
> +
> + cmeq \t1\().16b, \diff\().16b, #0
> + umax \diff\().16b, \diff\().16b, \t3\().16b
> + umax \diff\().16b, \diff\().16b, \t2\().16b
> + bic \diff\().16b, \diff\().16b, \t1\().16b
> +.endm
> +
> +// i0 = s0;
> +// if (i0 > d0 + diff0)
> +// i0 = d0 + diff0;
> +// else if (i0 < d0 - diff0)
> +// i0 = d0 - diff0;
> +//
> +// i0 = s0 is safe
> +.macro DIFF_CLIP i0, s0, d0, diff, t0, t1
> + uqadd \t0\().16b, \d0\().16b, \diff\().16b
> + uqsub \t1\().16b, \d0\().16b, \diff\().16b
> + umin \i0\().16b, \s0\().16b, \t0\().16b
> + umax \i0\().16b, \i0\().16b, \t1\().16b
> +.endm
> +
> +// i0 = FFABS(m1 - p1) > td0 ? i1 : i2;
> +// DIFF_CLIP
> +//
> +// i0 = i1 is safe
> +.macro INTERPOL i0, i1, i2, m1, d0, p1, td0, diff, t0, t1, t2
> + uabd \t0\().16b, \m1\().16b, \p1\().16b
> + cmhi \t0\().16b, \t0\().16b, \td0\().16b
> + bsl \t0\().16b, \i1\().16b, \i2\().16b
> + DIFF_CLIP \i0, \t0, \d0, \diff, \t1, \t2
> +.endm
> +
> +.macro PUSH_VREGS
> + stp d8, d9, [sp, #-64]!
> + stp d10, d11, [sp, #16]
> + stp d12, d13, [sp, #32]
> + stp d14, d15, [sp, #48]
> +.endm
> +
> +.macro POP_VREGS
> + ldp d14, d15, [sp, #48]
> + ldp d12, d13, [sp, #32]
> + ldp d10, d11, [sp, #16]
> + ldp d8, d9, [sp], #64
> +.endm
>
Could you squash these? Adding empty files and then filling them
up commit by commit is pointless and makes the series harder to
review. Just export what you need in one commit, and add
everything else in another.
Also, keep in mind that the final spatial clip at the end should
be removable. I discovered that dropping it makes the filter look
quite a lot better. Currently, only the Vulkan version does so, but
we're looking into changing the C/asm versions too, and you're the
second one to rush into implementing asm for it before we've
had a chance to discuss it properly.
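
For illustration only, here is a minimal scalar sketch in C of what a removable final clip could look like. It assumes the clip in question is the clamp described by the DIFF_CLIP comment in the quoted patch (i0 limited to the range d0 - diff0 .. d0 + diff0). The names sat_add_u8, sat_sub_u8, diff_clip, finish_pixel and the final_clip flag are hypothetical and are not part of the patch or of vf_bwdif.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar equivalents of the NEON uqadd/uqsub used in the quoted macro:
 * unsigned 8-bit add/subtract that saturate at 255 and 0. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + b;
    return s > 255 ? 255 : (uint8_t)s;
}

static uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Per-byte model of DIFF_CLIP: clamp s0 into [d0 - diff, d0 + diff].
 * Saturation is harmless here because s0 itself is already 0..255. */
static uint8_t diff_clip(uint8_t s0, uint8_t d0, uint8_t diff)
{
    uint8_t hi = sat_add_u8(d0, diff);   /* uqadd */
    uint8_t lo = sat_sub_u8(d0, diff);   /* uqsub */
    uint8_t v  = s0 < hi ? s0 : hi;      /* umin  */
    return v > lo ? v : lo;              /* umax  */
}

/* Hypothetical wrapper: final_clip is an illustrative flag only.
 * When it is zero the interpolated value is passed through unclipped. */
static uint8_t finish_pixel(uint8_t interpol, uint8_t d0, uint8_t diff,
                            int final_clip)
{
    return final_clip ? diff_clip(interpol, d0, diff) : interpol;
}

/* Small self-check of the clamp behaviour. */
static void diff_clip_selftest(void)
{
    assert(diff_clip(200, 100, 20) == 120);  /* clipped from above */
    assert(diff_clip(10, 100, 20) == 80);    /* clipped from below */
    assert(diff_clip(5, 10, 20) == 5);       /* lower bound saturates to 0 */
}
```

In the asm, the same effect would probably come from keeping the clamp as its own macro invocation at the end of the line filter, so that it can be dropped without touching the interpolation that precedes it. That is a guess at a workable structure, not something settled in this thread.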