Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Lynne <dev@lynne.ee>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder to lavu/tx
Date: Sun, 25 Sep 2022 22:45:12 +0200 (CEST)
Message-ID: <NCqOOQ8--3-2@lynne.ee>
In-Reply-To: <f5a6813a-1c72-cbb3-6b60-b857d67e431@martin.st>

Sep 25, 2022, 21:55 by martin@martin.st:

> On Sat, 24 Sep 2022, Lynne wrote:
>
>>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>>
>>
>> Haven't touched them, they're still being used. Unfortunately, for AC3,
>> the full MDCT optimizations in lavc do make a difference and the overall
>> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
>> asm disabled and 7% slower with lavu/tx's asm enabled.
>>
>
> Hmm, that's a shame...
>
>> I do plan to write an aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster, which should make the decoder at least 10% faster with lavu/tx.
>>
>
> Would you consider holding off of converting the ac3 decoder until this point, to avoid unnecessary temporary performance regressions at least for the architectures that are covered by the new lavu/tx framework?
>
>> If you'd like to help out, I've documented the C factorizations used in
>> docs/transforms.md.
>>
>
> Sorry, I don't think I have time at the moment to take on writing new code from scratch for this...
>
> I could maybe consider porting the aarch64 assembly to arm32; if it's not register starved, it's usually quite straightforward to do such rewrites (there's either half the number of SIMD registers compared to aarch64, or the same number but half the length)
>

For the basis transforms (double 4, double 8 and 8, single 16), there's no starvation.
For the 32pt transform, it's a bit starved, but nothing you couldn't work out.
For 64pt and up, absolutely all registers are used, to the point of needing to
stash vector regs in GPRs. If all registers are written back to memory (no register
sharing between transform sizes), it becomes only as starved as the 32pt.
It's easy to see where the starvation happens (only 32pt -> 64pt) and how to fix it,
but it's still work to convert the code. Could you take a look and see if you spot
anything that would make it difficult?


> The reason why I'm asking about arm32, is because ffmpeg has got a bunch of users who have spent a fair amount of effort on reaching specific performance levels for some codecs, both for raspberry pi 1 (which doesn't have neon but only vfp) and for the newer ones with neon. I don't remember exactly which codecs are relevant for these users - I doubt opus is, but ac3 and dca are, iirc.
>

We do maintain old versions for years after a release. We had a major bump
fairly recently, and 5.1 came out very recently. I think there's enough time
to bring them back up, and make them faster still, before users stuck on old
versions become too outdated. What do you think? Maybe someone who's
interested will notice and help out.


> I'm CCing Ben Avison who has contributed a lot of optimizations in this area.
>

Thanks.