Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: "Martin Storsjö" <martin@martin.st>
To: John Cox <jc@kynesim.co.uk>
Cc: thomas.mundt@hr.de, ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v4 0/7] avfilter/vf_bwdif: Add aarch64 neon functions
Date: Thu, 6 Jul 2023 00:19:50 +0300 (EEST)
Message-ID: <2fc449f-836b-43b-32c6-7b5c575d5ad9@martin.st> (raw)
In-Reply-To: <20230704140445.240426-1-jc@kynesim.co.uk>

On Tue, 4 Jul 2023, John Cox wrote:

> Also adds a filter_line3 method which on aarch64 neon yields approx 30%
> speedup over 2xfilter_line and a memcpy
>
> Differences from v3:
> Remove a few lines of neon in filter_line that should have been removed
> when copying from line3
>
> Sorry about the two patch sets in quick succession, but I think I've
> applied all the requested changes and I didn't want this mistake in the
> final patchset. (The mistake was benign - it just wasted a few cycles.)
>
> John Cox (7):
>  tests/checkasm: Add test for vf_bwdif filter_intra
>  avfilter/vf_bwdif: Add neon for filter_intra
>  tests/checkasm: Add test for vf_bwdif filter_edge
>  avfilter/vf_bwdif: Add neon for filter_edge
>  avfilter/vf_bwdif: Add neon for filter_line
>  avfilter/vf_bwdif: Add a filter_line3 method for optimisation
>  avfilter/vf_bwdif: Add neon for filter_line3

I think this looks ok to me, so I'll go ahead and push it. The tests pass 
on x86 too, msvc/aarch64, llvm-mingw/aarch64, macOS and linux.

Just a couple notes I didn't remember to mention before:

- Regarding the int parameters on the stack; as long as you do have the C 
wrapper functions, you don't strictly need to have the same function 
signature for the NEON function as for the actual DSP function. So if 
you'd have wanted to have a different signature for the NEON function 
(changing it to intptr_t), that'd worked too. But I do see the benefit of 
keeping it identical to the DSP function interface.

- The way of making the the C function exported and calling that for the 
tail is neat, but kinda unusual within ffmpeg. In most cases (except for 
parts of swscale), we can just assume and rely on buffers being aligned 
enough for the SIMD vector length of the current platform, and freely 
overwrite a little into the padding at the end of the lines. Not sure if 
this is the case here though.

(If it is, it's easy enough to remove those bits and make the C functions 
static again as a follow-up.)

Also, checkasm coverage for >8bpp would be nice as mentioned, but if 
someone wants to write asm for that, it should be doable to factorize the 
new tests to run them for both 8 and 16 bpp.

That said, it looks ok enough to me so I'll push it.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

  parent reply	other threads:[~2023-07-05 21:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-04 14:04 John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 1/7] tests/checkasm: Add test for vf_bwdif filter_intra John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 2/7] avfilter/vf_bwdif: Add neon for filter_intra John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 3/7] tests/checkasm: Add test for vf_bwdif filter_edge John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 4/7] avfilter/vf_bwdif: Add neon for filter_edge John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 5/7] avfilter/vf_bwdif: Add neon for filter_line John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 6/7] avfilter/vf_bwdif: Add a filter_line3 method for optimisation John Cox
2023-07-04 14:04 ` [FFmpeg-devel] [PATCH v4 7/7] avfilter/vf_bwdif: Add neon for filter_line3 John Cox
2023-07-05 21:19 ` Martin Storsjö [this message]
2023-07-06  9:30   ` [FFmpeg-devel] [PATCH v4 0/7] avfilter/vf_bwdif: Add aarch64 neon functions John Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2fc449f-836b-43b-32c6-7b5c575d5ad9@martin.st \
    --to=martin@martin.st \
    --cc=ffmpeg-devel@ffmpeg.org \
    --cc=jc@kynesim.co.uk \
    --cc=thomas.mundt@hr.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git