Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 help / color / mirror / Atom feed
From: "Martin Storsjö" <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Ben Avison <bavison@riscosopen.org>
Subject: Re: [FFmpeg-devel] [PATCH 0/6] avcodec/vc1: Arm optimisations
Date: Sun, 20 Mar 2022 01:06:03 +0200 (EET)
Message-ID: <d950efc7-e7ac-81c4-2b36-d9544a7b5f28@martin.st> (raw)
In-Reply-To: <20220317185819.466470-1-bavison@riscosopen.org>

Hi Ben,

On Thu, 17 Mar 2022, Ben Avison wrote:

> The VC1 decoder was missing lots of important fast paths for Arm, especially
> for 64-bit Arm. This submission fills in implementations for all functions
> where a fast path already existed and the fallback C implementation was
> taking 1% or more of the runtime, and adds a new fast path to permit
> vc1_unescape_buffer() to be overridden.
>
> I've measured the playback speed on a 1.5 GHz Cortex-A72 (Raspberry Pi 4)
> using `ffmpeg -i <bitstream> -f null -` for a couple of example streams:
>
> Architecture:  AArch32    AArch32    AArch64    AArch64
> Stream:        1          2          1          2
> Before speed:  1.22x      0.82x      1.00x      0.67x
> After speed:   1.31x      0.98x      1.39x      1.06x
> Improvement:   7.4%       20%        39%        58%
>
> `make fate` passes on both AArch32 and AArch64.

Thanks for the patches! I have looked briefly at the patches (I haven't 
started reading the implementation in detail yet though).

As you are writing assembly for these functions, I would very much 
appreciate if you could add checkasm tests for all the functions you're 
implementing. I see that there exists a test for the blockdsp functions, 
but all the other ones are missing a test.

I try to request such tests for all new assembly. Such a test allows 
testing all interesting cornercases of the DSP functions with one concise 
test, instead of having to run the full fate testsuite. It also allows 
catching a number of other possible lingering issues, like using the full 
64 bit register when the argument only set 32 bits where the upper bits 
are undefined, or missing to restore callee saved registers, etc. It also 
allows for easy benchmarking of the functions on their own, which is very 
useful for tuning of the implementation. And it finally allows easily 
checking that the assembly works correctly when built with a different 
toolchain for a different platform - without needing to run the full 
decoding tests.

Especially as you've been implementing the functions, you're probably more 
familiar with the expectations and behaviours (and potential cornercases 
that are worth testing) of the functions than most other developers in the 
community at the moment, which is good for writing useful testcases.

There's plenty of existing examples of such tests - the h264dsp, vp8dsp 
and vp9dsp cases might be relevant.


The other main issue I'd like to request is to indent the assembly 
similarly to the rest of the existing assembly. For the 32 bit assembly, 
your patches do match the surrounding code, but for the 64 bit assembly, 
your patches align the operands column differently than the rest. (I think 
your code aligns the operands with 16 chars to the left of the operands, 
while our code aligns it with 24 chars to the left, both in 32 and 64 bit 
arm assembly.)


Finally, the 32 bit assembly fails to build for me both with (recent) 
clang and old binutils, with errors like these:

src/libavcodec/arm/vc1dsp_neon.S: Assembler messages:
src/libavcodec/arm/vc1dsp_neon.S:1579: Error: bad type for scalar -- `vmov r0,d4[1]'
src/libavcodec/arm/vc1dsp_neon.S:1582: Error: bad type for scalar -- `vmov r2,d5[1]'
src/libavcodec/arm/vc1dsp_neon.S:1592: Error: bad type for scalar -- `vmov r2,d8[1]'
src/libavcodec/arm/vc1dsp_neon.S:1595: Error: bad type for scalar -- `vmov r12,d9[1]'

Qualifying the "vmov" into "vmov.32" seems to fix it.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

  parent reply	other threads:[~2022-03-19 23:06 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-17 18:58 Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 1/6] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 2/6] avcodec/vc1: Arm 32-bit " Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 3/6] avcodec/vc1: Arm 64-bit NEON inverse transform " Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 4/6] avcodec/idctdsp: Arm 64-bit NEON block add and clamp " Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 5/6] avcodec/blockdsp: Arm 64-bit NEON block clear " Ben Avison
2022-03-17 18:58 ` [FFmpeg-devel] [PATCH 6/6] avcodec/vc1: Introduce fast path for unescaping bitstream buffer Ben Avison
2022-03-18 19:10   ` Andreas Rheinhardt
2022-03-21 15:51     ` Ben Avison
2022-03-21 20:44       ` Martin Storsjö
2022-03-19 23:06 ` Martin Storsjö [this message]
2022-03-19 23:07   ` [FFmpeg-devel] [PATCH 0/6] avcodec/vc1: Arm optimisations Martin Storsjö
2022-03-21 17:37   ` Ben Avison
2022-03-21 22:29     ` Martin Storsjö
2022-03-25 18:52 ` [FFmpeg-devel] [PATCH v2 00/10] " Ben Avison
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests Ben Avison
2022-03-25 22:53     ` Martin Storsjö
2022-03-28 18:28       ` Ben Avison
2022-03-29 11:47         ` Martin Storsjö
2022-03-29 12:24     ` Martin Storsjö
2022-03-29 12:43     ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 02/10] checkasm: Add vc1dsp inverse transform tests Ben Avison
2022-03-29 12:41     ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests Ben Avison
2022-03-29 13:13     ` Martin Storsjö
2022-03-29 19:56       ` Martin Storsjö
2022-03-29 20:22       ` Ben Avison
2022-03-29 20:30         ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer Ben Avison
2022-03-29 20:37     ` Martin Storsjö
2022-03-31 13:58       ` Ben Avison
2022-03-31 14:07         ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths Ben Avison
2022-03-30 12:35     ` Martin Storsjö
2022-03-31 15:15       ` Ben Avison
2022-03-31 21:21         ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 06/10] avcodec/vc1: Arm 32-bit " Ben Avison
2022-03-25 19:27     ` Lynne
2022-03-25 19:49       ` Martin Storsjö
2022-03-25 19:55         ` Lynne
2022-03-30 12:37     ` Martin Storsjö
2022-03-30 13:03     ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform " Ben Avison
2022-03-30 13:49     ` Martin Storsjö
2022-03-30 14:01       ` Martin Storsjö
2022-03-31 15:37       ` Ben Avison
2022-03-31 21:32         ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp " Ben Avison
2022-03-30 14:14     ` Martin Storsjö
2022-03-31 16:47       ` Ben Avison
2022-03-31 21:42         ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 09/10] avcodec/vc1: Arm 64-bit NEON unescape fast path Ben Avison
2022-03-30 14:35     ` Martin Storsjö
2022-03-25 18:52   ` [FFmpeg-devel] [PATCH 10/10] avcodec/vc1: Arm 32-bit " Ben Avison
2022-03-30 14:35     ` Martin Storsjö

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d950efc7-e7ac-81c4-2b36-d9544a7b5f28@martin.st \
    --to=martin@martin.st \
    --cc=bavison@riscosopen.org \
    --cc=ffmpeg-devel@ffmpeg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
		ffmpegdev@gitmailbox.com
	public-inbox-index ffmpegdev

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git