Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 3/3] avcodec/aarch64: add hevc deblock NEON
Date: Wed, 21 Feb 2024 14:08:16 +0200 (EET)
Message-ID: <e66cf825-93e7-1b2-3fa5-2ab6972789bc@martin.st>
In-Reply-To: <20240221111003.185240-3-jdek@itanimul.li>

On Wed, 21 Feb 2024, J. Dekker wrote:

> Benched using single-threaded full decode on an Ampere Altra.
>
> Bpp Before  After  Speedup
> 8   73,3s   65,2s  1.124x
> 10  114,2s  104,0s 1.098x
> 12  125,8s  115,7s 1.087x
>
> Signed-off-by: J. Dekker <jdek@itanimul.li>
> ---
> libavcodec/aarch64/hevcdsp_deblock_neon.S | 421 ++++++++++++++++++++++
> libavcodec/aarch64/hevcdsp_init_aarch64.c |  18 +
> 2 files changed, 439 insertions(+)

> +0:      // STRONG FILTER
> +
> +        // P0 = p0 + av_clip(((p2 + 2 * p1 + 2 * p0 + 2 * q0 + q1 + 4) >> 3) - p0, -tc3, tc3);
> +        add             v21.8h, v2.8h, v3.8h   // (p1 + p0
> +        add             v21.8h, v4.8h, v21.8h  //     + q0)
> +        shl             v21.8h, v21.8h, #1     //           * 2
> +        add             v22.8h, v1.8h, v5.8h   //   (p2 + q1)
> +        add             v21.8h, v22.8h, v21.8h // +
> +        srshr            v21.8h, v21.8h, #3     //               >> 3
> +        sub             v21.8h, v21.8h, v3.8h  //                    - p0
> +

The srshr line is incorrectly indented here (and elsewhere)
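
As a scalar cross-check of the P0 formula quoted in the comment above,
here is a minimal C sketch. The names clip_int and strong_filter_p0 are
hypothetical (not from the patch); clip_int stands in for av_clip, and
tc3 is simply taken as the clipping bound, without assuming how it is
derived. The srshr #3 in the NEON code is a rounding shift, so it covers
both the "+ 4" and the ">> 3".

```c
/* Hypothetical stand-in for av_clip(): clamp x to [lo, hi]. */
static int clip_int(int x, int lo, int hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/*
 * Scalar form of the quoted comment:
 *   P0 = p0 + av_clip(((p2 + 2*p1 + 2*p0 + 2*q0 + q1 + 4) >> 3) - p0,
 *                     -tc3, tc3);
 * Samples are assumed non-negative, so the right shift is well defined.
 */
static int strong_filter_p0(int p2, int p1, int p0, int q0, int q1, int tc3)
{
    /* Same grouping as the NEON code: (p1 + p0 + q0) * 2 + (p2 + q1). */
    int sum = 2 * (p1 + p0 + q0) + (p2 + q1);
    /* Rounding shift right by 3, i.e. what srshr #3 computes. */
    int avg = (sum + 4) >> 3;
    /* Limit the change applied to p0 to [-tc3, tc3]. */
    return p0 + clip_int(avg - p0, -tc3, tc3);
}
```

With the samples 10, 20, 30, 40, 50 the rounded average equals p0, so p0
stays at 30. With a step edge 0, 0, 0, 100, 100 and tc3 = 6, the raw
delta is 38 and gets clipped to 6.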

> +        sqxtun          v4.8b, v4.8h
> +        sqxtun          v5.8b, v5.8h
> +        sqxtun          v6.8b, v6.8h
> +        sqxtun          v7.8b, v7.8h
> +.endif
> +        ret
> +3:      ret x6

Please indent the "x6" here like other operands

> +.macro hevc_loop_filter_luma dir bitdepth
> +function ff_hevc_\dir\()_loop_filter_luma_\bitdepth\()_neon, export=1
> +        mov             x6, x30
> +.if \dir == v

In GAS, .if evaluates a numeric expression - it can't do string
comparisons.

The right way to do this is ".ifc \dir, v", which does a string
comparison.

(If you really do need a numerical comparison here, it's possible to
define e.g. "v" as a numeric symbol as well, see e.g.
https://code.videolan.org/videolan/dav1d/-/merge_requests/1603/diffs?commit_id=d4746c908c56cb2e8545efd348b8cdc13f2f2253
but that's not really the nicest way to do it.)

This issue breaks compilation with Clang. With gas-preprocessor (for
MSVC), it builds without errors, but does the wrong thing.


To avoid having to test all these build configurations manually, and
having to remember to check every corner case configuration, the
indentation and so on, I've set up a PoC for testing such things on
GitHub Actions.

If you have a repo on GitHub, grab my commits from
https://github.com/mstorsjo/FFmpeg/commits/gha-aarch64 (there are a couple
of them), add your changes on top of these, push the result as a branch
to your own GitHub repo, then check the output from the actions.

Here's the output of a run with the patches you just posted: 
https://github.com/mstorsjo/FFmpeg/actions/runs/7988312683

// Martin
