From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by master.gitmailbox.com (Postfix) with ESMTP id CF03C44820
	for <ffmpegdev@gitmailbox.com>; Sun, 25 Sep 2022 19:55:51 +0000 (UTC)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1DA2C68B99F;
	Sun, 25 Sep 2022 22:55:49 +0300 (EEST)
Received: from mail8.parnet.fi (mail8.parnet.fi [77.234.108.134])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 07702689CCC
 for <ffmpeg-devel@ffmpeg.org>; Sun, 25 Sep 2022 22:55:41 +0300 (EEST)
Received: from mail9.parnet.fi (mail9.parnet.fi [77.234.108.21])
 by mail8.parnet.fi  with ESMTP id 28PJteYR008882-28PJteYS008882;
 Sun, 25 Sep 2022 22:55:40 +0300
Received: from foo.martin.st (host-97-187.parnet.fi [77.234.97.187])
 by mail9.parnet.fi (Postfix) with ESMTPS id E9DA7A1467;
 Sun, 25 Sep 2022 22:55:39 +0300 (EEST)
Date: Sun, 25 Sep 2022 22:55:37 +0300 (EEST)
From: =?ISO-8859-15?Q?Martin_Storsj=F6?= <martin@martin.st>
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
In-Reply-To: <NClNyyy--3-2@lynne.ee>
Message-ID: <f5a6813a-1c72-cbb3-6b60-b857d67e431@martin.st>
References: <NCgcUxK--3-2@lynne.ee>
 <37cff64-511b-518d-769-f02c1fc7e49f@martin.st>
 <CA+anqdwx3s=coX99ZhXjFJzT4qth9veAhx4JA-1BuxxAeBPxrg@mail.gmail.com>
 <CA+anqdy_Mi=J1=Kg7nFHtSw_6nQshVtUzWM+hoN7zpo142nYZA@mail.gmail.com>
 <38345618-1535-1c53-6c28-52d0796e217@martin.st>
 <NClNyyy--3-2@lynne.ee>
MIME-Version: 1.0
X-FE-Policy-ID: 3:14:2:SYSTEM
Subject: Re: [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder to
 lavu/tx
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Ben Avison <bavison@riscosopen.org>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Archived-At: <https://master.gitmailbox.com/ffmpegdev/f5a6813a-1c72-cbb3-6b60-b857d67e431@martin.st/>
List-Archive: <https://master.gitmailbox.com/ffmpegdev/>
List-Post: <mailto:ffmpegdev@gitmailbox.com>

On Sat, 24 Sep 2022, Lynne wrote:

>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>
>
> Haven't touched them, they're still being used. Unfortunately, for AC3,
> the full MDCT optimizations in lavc do make a difference and the overall
> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
> asm disabled and 7% slower with lavu/tx's asm enabled.

Hmm, that's a shame...

> I do plan to write an aarch64 MDCT NEON SIMD code in a month or so, 
> unless someone is faster, which should make the decoder at least 10% 
> faster with lavu/tx.

Would you consider holding off on converting the ac3 decoder until that 
point, to avoid unnecessary temporary performance regressions at least for 
the architectures that are covered by the new lavu/tx framework?

> If you'd like to help out, I've documented the C factorizations used in
> docs/transforms.md.

Sorry, I don't think I have time at the moment to take on writing new code 
from scratch for this...

I could maybe consider porting the aarch64 assembly to arm32; as long as 
the code isn't register starved, such rewrites are usually quite 
straightforward (arm32 NEON has either half the number of SIMD registers 
compared to aarch64, or the same number at half the length).


The reason I'm asking about arm32 is that ffmpeg has a bunch of users who 
have spent a fair amount of effort on reaching specific performance levels 
for some codecs, both for Raspberry Pi 1 (which doesn't have NEON, only 
VFP) and for the newer ones with NEON. I don't remember exactly which 
codecs are relevant for these users - I doubt opus is, but ac3 and dca 
are, iirc.

I'm CCing Ben Avison who has contributed a lot of optimizations in this 
area.

// Martin
