From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder to lavu/tx
Date: Sun, 25 Sep 2022 23:17:28 +0200
Message-ID: <AS8P250MB0744B23538CC2C2BD1ABA8498F539@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <NCqThVx--3-2@lynne.ee>
Lynne:
> Sep 25, 2022, 14:34 by andreas.rheinhardt@outlook.com:
>
>> Lynne:
>>
>>> Sep 24, 2022, 23:57 by dev@lynne.ee:
>>>
>>>> Sep 24, 2022, 21:40 by martin@martin.st:
>>>>
>>>>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>>>>
>>>> Haven't touched them, they're still being used. Unfortunately, for AC3,
>>>> the full MDCT optimizations in lavc do make a difference and the overall
>>>> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
>>>> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
>>>> an aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
>>>> which should make the decoder at least 10% faster with lavu/tx.
>>>>
>>>
>>> I'd just like to add this was for the float version of the ac3 decoder. The fixed-point
>>> version is a few percent faster with the patch on an A53, and quite a bit
>>> more accurate.
>>> The lavc fixed-point FFT code also has some weird large spikes in #cycles
>>> for some transform sizes, so the figure above is an average, but the dips
>>> went from 117x realtime to 78x realtime, which on a slower CPU may
>>> be the difference between stuttering and realtime playback.
>>> On this CPU, the fixed-point version is 23% slower than the float version,
>>> but on a CPU with slower float ops, it would make more sense to pick that
>>> decoder up than the float version.
>>> The 2 decoders produce nearly identical results, minus a few rounding
>>> errors, since AC3 is inherently a fixed-point codec. The only difference
>>> are the transforms themselves, and the extra ops needed to convert
>>> the 25bit ints to floats in the float decoder.
>>>
>>
>> 1. You forgot to remove mdct15 requirements from configure in this whole
>> patchset.
>> 2. You forgot to update the FATE references for several tests; e.g. when
>> only applying the ac3 patch, then I get this:
>>
>
> I know. durandal pointed it out the day I sent them. I'll send them again
> later.
> I'm planning to just push the Opus patch in a day with the mdct15
> line in configure gone.
>
>
>> As the above shows, the difference between the reference files and the
>> decoded output becomes larger in several tests, i.e. the reference files
>> won't be usable lateron. If the new float and fixed-point decoders
>> produce indeed produce nearly identical output, then one could write
>> tests that decode the same file with both the floating point and the
>> fixed point decoder, check that both are nearly identical and print a
>> checksum of the output of the fixed point decoder.
>>
>
> I have a standalone program I've hacked on as I need to for the fixed-point
> transforms: https://0x0.st/oWxO.c
> The square root of the squared rounding error across the entire range
> (1 to 21 bits) of transforms from 32pt to 1024pt is 6.855655 for lavu and
> 7.141428 for lavc, which is slightly worse. If you extend the range
> to 22bits, the 1024pt transform in lavc explodes, while lavu is still fine,
> thus showing a greater range.
> The rounding errors are a lesser problem than hitting the max range,
> because then you get huge spikes in the output.
> I can further reduce the error in lavu at the cost of speed, but I think
> this is sufficient.
>
>
>> Also note that there is currently no test that directly verifies your
>> claims of greater accuracy. One could write such a test by encoding a
>> file with ac3-fixed and decoding it again (with the fixed point decoder)
>> and printing the psnr of input and output. No encoding tests does this
>> at the moment.
>>
>
> I'm not writing that, but I like the idea, the point of fixed-point decoders
> isn't bitexactness, but speed on slow hardware, so we shouldn't be testing
> an MD5.
Are your fixed-point transforms bitexact across all arches/cpuflags?
- Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
next prev parent reply other threads:[~2022-09-25 21:17 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-23 23:14 Lynne
[not found] ` <NCgcUxK--3-2@lynne.ee-NCgcZNj----2>
2022-09-23 23:15 ` [FFmpeg-devel] [PATCH 2/6] atrac9dec: switch " Lynne
[not found] ` <NCgciJh--3-2@lynne.ee-NCgclLI----2>
2022-09-23 23:18 ` [FFmpeg-devel] [PATCH 3/6] ac3: convert encoder and decoder " Lynne
[not found] ` <NCgdFqI--B-2@lynne.ee-NCgdIwE----2>
2022-09-23 23:18 ` [FFmpeg-devel] [PATCH 4/6] vorbisdec: convert " Lynne
[not found] ` <NCgdOA8--3-2@lynne.ee-NCgdR4N----2>
2022-09-23 23:19 ` [FFmpeg-devel] [PATCH 5/6] twinvq: " Lynne
[not found] ` <NCgdYSD--3-2@lynne.ee-NCgdaK4----2>
2022-09-23 23:20 ` [FFmpeg-devel] [PATCH 6/6] wmaprodec: " Lynne
2022-09-25 12:38 ` Andreas Rheinhardt
2022-09-24 18:42 ` [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder " Martin Storsjö
2022-09-24 19:26 ` Hendrik Leppkes
2022-09-24 19:31 ` Hendrik Leppkes
2022-09-24 19:40 ` Martin Storsjö
2022-09-24 21:57 ` Lynne
2022-09-25 19:55 ` Martin Storsjö
2022-09-25 20:45 ` Lynne
[not found] ` <NClNyyy--3-2@lynne.ee-NClVNO6----2>
2022-09-25 7:54 ` Lynne
2022-09-25 12:34 ` Andreas Rheinhardt
2022-09-25 21:08 ` Lynne
2022-09-25 21:17 ` Andreas Rheinhardt [this message]
2022-09-25 21:46 ` Lynne
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=AS8P250MB0744B23538CC2C2BD1ABA8498F539@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM \
--to=andreas.rheinhardt@outlook.com \
--cc=ffmpeg-devel@ffmpeg.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
This inbox may be cloned and mirrored by anyone:
git clone --mirror https://master.gitmailbox.com/ffmpegdev/0 ffmpegdev/git/0.git
# If you have public-inbox 1.1+ installed, you may
# initialize and index your mirror using the following commands:
public-inbox-init -V2 ffmpegdev ffmpegdev/ https://master.gitmailbox.com/ffmpegdev \
ffmpegdev@gitmailbox.com
public-inbox-index ffmpegdev
Example config snippet for mirrors.
AGPL code for this site: git clone https://public-inbox.org/public-inbox.git