Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: Hubert Mazur <hum@semihalf.com>
To: "Martin Storsjö" <martin@martin.st>
Cc: gjb@semihalf.com, upstream@semihalf.com, jswinney@amazon.com,
	ffmpeg-devel@ffmpeg.org, mw@semihalf.com, spop@amazon.com
Subject: Re: [FFmpeg-devel] [PATCH 4/4] sw_scale: Add specializations for hscale 16 to 19
Date: Tue, 25 Oct 2022 09:14:50 +0200
Message-ID: <CAGEip35hgrFEhD=YtAk9AJMDrxEH4HcbxMgTAhn-stpSrPJg0w@mail.gmail.com> (raw)
In-Reply-To: <757bb051-c5c3-91cb-cb7a-f5854bc6d26e@martin.st>

Thanks for the review.
I will fix the failing checkasm first and then take care of the minor
issues. I will try to to resend fixed versions this week.

Regards,
Hubert

On Mon, Oct 24, 2022 at 3:19 PM Martin Storsjö <martin@martin.st> wrote:

> On Mon, 17 Oct 2022, Hubert Mazur wrote:
>
> > Provide arm64 neon optimized implementations for hscale16To19 with
> > filter sizes 4, 8 and X4.
> >
> > The tests and benchmarks run on AWS Graviton 2 instances.
> > The results from a checkasm tool are shown below.
> >
> > hscale_16_to_19__fs_4_dstW_512_c: 6216.0
> > hscale_16_to_19__fs_4_dstW_512_neon: 2257.0
> > hscale_16_to_19__fs_8_dstW_512_c: 10417.7
> > hscale_16_to_19__fs_8_dstW_512_neon: 3112.5
> > hscale_16_to_19__fs_12_dstW_512_c: 14890.5
> > hscale_16_to_19__fs_12_dstW_512_neon: 3899.0
> > hscale_16_to_19__fs_16_dstW_512_c: 19006.5
> > hscale_16_to_19__fs_16_dstW_512_neon: 5341.2
> > hscale_16_to_19__fs_32_dstW_512_c: 36629.5
> > hscale_16_to_19__fs_32_dstW_512_neon: 9502.7
> > hscale_16_to_19__fs_40_dstW_512_c: 45477.5
> > hscale_16_to_19__fs_40_dstW_512_neon: 11552.0
> >
> > Signed-off-by: Hubert Mazur <hum@semihalf.com>
> > ---
> > libswscale/aarch64/hscale.S  | 402 +++++++++++++++++++++++++++++++++++
> > libswscale/aarch64/swscale.c |  70 +++++-
> > 2 files changed, 471 insertions(+), 1 deletion(-)
>
> > +void ff_hscale16to19_4_neon_asm(int shift, int16_t *_dst, int dstW,
> > +                      const uint8_t *_src, const int16_t *filter,
> > +                      const int32_t *filterPos, int filterSize);
> > +void ff_hscale16to19_X8_neon_asm(int shift, int16_t *_dst, int dstW,
> > +                      const uint8_t *_src, const int16_t *filter,
> > +                      const int32_t *filterPos, int filterSize);
> > +void ff_hscale16to19_X4_neon_asm(int shift, int16_t *_dst, int dstW,
> > +                      const uint8_t *_src, const int16_t *filter,
> > +                      const int32_t *filterPos, int filterSize);
> > +
> > #define SCALE_FUNC(filter_n, from_bpc, to_bpc, opt) \
> > void ff_hscale ## from_bpc ## to ## to_bpc ## _ ## filter_n ## _ ## opt(
> \
> >                                                 SwsContext *c, int16_t
> *data, \
> > @@ -43,7 +53,8 @@ void ff_hscale ## from_bpc ## to ## to_bpc ## _ ##
> filter_n ## _ ## opt( \
> > #define SCALE_FUNCS(filter_n, opt) \
> >     SCALE_FUNC(filter_n,  8, 15, opt); \
> >     SCALE_FUNC(filter_n, 8, 19, opt); \
> > -    SCALE_FUNC(filter_n, 16, 15, opt);
> > +    SCALE_FUNC(filter_n, 16, 15, opt); \
> > +    SCALE_FUNC(filter_n, 16, 19, opt);
>
> So this declares the functions we're implementing as C wrappers below, and
> the manual declarations further up declare the actual asm functions?
>
> I guess that works, although it makes unnecessary extern functions. In
> such cases, we usually have the C functions be static functions, placed
> above the code that uses them. But it's not a big deal.
>
> Other than that, this patchset mostly seems fine.
>
> However, I tested the patches on x86, and the new checkasm tests do fail
> on x86 (both i386 and x86_64) - so that needs to be fixed anyway. So since
> we'll need to do a new round anyway, please do try to fix up the minor
> cosmetics I mentioned.
>
> // Martin
>
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

      reply	other threads:[~2022-10-25  7:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-17 13:07 [FFmpeg-devel] [PATCH 0/4] Provide neon implementations for hscale functions Hubert Mazur
2022-10-17 13:07 ` [FFmpeg-devel] [PATCH 1/4] sw_scale: Add specializations for hscale 8 to 19 Hubert Mazur
2022-10-24 12:31   ` Martin Storsjö
2022-10-17 13:07 ` [FFmpeg-devel] [PATCH 2/4] tests/sw_scale: Add test cases for input sizes 16 Hubert Mazur
2022-10-17 13:07 ` [FFmpeg-devel] [PATCH 3/4] sw_scale: Add specializations for hscale 16 to 15 Hubert Mazur
2022-10-17 13:07 ` [FFmpeg-devel] [PATCH 4/4] sw_scale: Add specializations for hscale 16 to 19 Hubert Mazur
2022-10-24 13:19   ` Martin Storsjö
2022-10-25  7:14     ` Hubert Mazur [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGEip35hgrFEhD=YtAk9AJMDrxEH4HcbxMgTAhn-stpSrPJg0w@mail.gmail.com' \
    --to=hum@semihalf.com \
    --cc=ffmpeg-devel@ffmpeg.org \
    --cc=gjb@semihalf.com \
    --cc=jswinney@amazon.com \
    --cc=martin@martin.st \
    --cc=mw@semihalf.com \
    --cc=spop@amazon.com \
    --cc=upstream@semihalf.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git