Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder to lavu/tx
Date: Sun, 25 Sep 2022 23:17:28 +0200
Message-ID: <AS8P250MB0744B23538CC2C2BD1ABA8498F539@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM>
In-Reply-To: <NCqThVx--3-2@lynne.ee>

Lynne:
> Sep 25, 2022, 14:34 by andreas.rheinhardt@outlook.com:
> 
>> Lynne:
>>
>>> Sep 24, 2022, 23:57 by dev@lynne.ee:
>>>
>>>> Sep 24, 2022, 21:40 by martin@martin.st:
>>>>
>>>>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>>>>
>>>> Haven't touched them, they're still being used. Unfortunately, for AC3,
>>>> the full MDCT optimizations in lavc do make a difference and the overall
>>>> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
>>>> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
>>>> aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
>>>> which should make the decoder at least 10% faster with lavu/tx.
>>>>
>>>
>>> I'd just like to add this was for the float version of the ac3 decoder. The fixed-point
>>> version is a few percent faster with the patch on an A53, and quite a bit
>>> more accurate.
>>> The lavc fixed-point FFT code also has some weird large spikes in #cycles
>>> for some transform sizes, so the figure above is an average, but the dips
>>> went from 117x realtime to 78x realtime, which on a slower CPU may
>>> be the difference between stuttering and realtime playback.
>>> On this CPU, the fixed-point version is 23% slower than the float version,
>>> but on a CPU with slower float ops, it would make more sense to pick that
>>> decoder up than the float version.
>>> The 2 decoders produce nearly identical results, minus a few rounding
>>> errors, since AC3 is inherently a fixed-point codec. The only differences
>>> are the transforms themselves, and the extra ops needed to convert
>>> the 25-bit ints to floats in the float decoder.
>>>
>>
>> 1. You forgot to remove mdct15 requirements from configure in this whole
>> patchset.
>> 2. You forgot to update the FATE references for several tests; e.g. when
>> only applying the ac3 patch, then I get this:
>>
> 
> I know. durandal pointed it out the day I sent them. I'll send them again
> later.
> I'm planning to just push the Opus patch in a day with the mdct15
> line in configure gone.
> 
> 
>> As the above shows, the difference between the reference files and the
>> decoded output becomes larger in several tests, i.e. the reference files
>> won't be usable later on. If the new float and fixed-point decoders
>> indeed produce nearly identical output, then one could write
>> tests that decode the same file with both the floating point and the
>> fixed point decoder, check that both are nearly identical and print a
>> checksum of the output of the fixed point decoder.
>>
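A rough sketch of the comparison proposed above, not an existing FATE test. It assumes both decoders' output has already been converted to raw signed 16-bit little-endian PCM (for example with ffmpeg's s16le output format) and loaded as bytes. The function names and the tolerance of one LSB are hypothetical.

```python
import hashlib
import struct


def max_abs_diff(pcm_a: bytes, pcm_b: bytes) -> int:
    """Largest per-sample difference between two s16le PCM buffers."""
    if len(pcm_a) != len(pcm_b) or len(pcm_a) % 2:
        raise ValueError("buffers must have the same, even length")
    n = len(pcm_a) // 2
    a = struct.unpack(f"<{n}h", pcm_a)
    b = struct.unpack(f"<{n}h", pcm_b)
    return max((abs(x - y) for x, y in zip(a, b)), default=0)


def check_decoders(float_pcm: bytes, fixed_pcm: bytes, tolerance: int = 1):
    """Return (outputs are nearly identical, MD5 of the fixed-point output).

    The tolerance is an assumed threshold in 16-bit LSBs, not a value
    taken from the thread.
    """
    nearly_identical = max_abs_diff(float_pcm, fixed_pcm) <= tolerance
    digest = hashlib.md5(fixed_pcm).hexdigest()
    return nearly_identical, digest
```

The checksum is only meaningful as a reference if the fixed-point output is itself bitexact across platforms.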
> 
> I have a standalone program I've hacked on as I need to for the fixed-point
> transforms: https://0x0.st/oWxO.c
> The square root of the squared rounding error across the entire range
> (1 to 21 bits) of transforms from 32pt to 1024pt is 6.855655 for lavu and
> 7.141428 for lavc, which is slightly worse. If you extend the range
> to 22 bits, the 1024pt transform in lavc explodes, while lavu is still fine,
> thus showing a greater range.
> The rounding errors are a lesser problem than hitting the max range,
> because then you get huge spikes in the output.
> I can further reduce the error in lavu at the cost of speed, but I think
> this is sufficient.
> 
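A minimal sketch of the two effects described above. It reads "square root of the squared rounding error" as the square root of the sum of squared differences against a reference output; the linked program may define it differently. The int32 wraparound helper is only an assumed illustration of why exceeding the range is worse than rounding, not a description of what lavc or lavu actually do on overflow.

```python
import math


def root_squared_error(reference, result):
    """Square root of the summed squared differences between two outputs."""
    if len(reference) != len(result):
        raise ValueError("outputs must have the same length")
    return math.sqrt(sum((r - x) ** 2 for r, x in zip(reference, result)))


def wrap_int32(value: int) -> int:
    """Wrap an integer into the signed 32-bit range (two's complement)."""
    return ((value + 2**31) % 2**32) - 2**31
```

A rounding error moves a sample by a fraction of an LSB, whereas a wrapped value jumps from one end of the range to the other, which would show up as a spike in the output.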
> 
>> Also note that there is currently no test that directly verifies your
>> claims of greater accuracy. One could write such a test by encoding a
>> file with ac3-fixed and decoding it again (with the fixed point decoder)
>> and printing the PSNR of input and output. No encoding test does this
>> at the moment.
>>
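For reference, a sketch of the PSNR figure such a round-trip test would print, assuming 16-bit samples already loaded as integer sequences. The function name and the choice of 32767 as peak value are assumptions; an actual FATE test would more likely reuse the tiny_psnr tool from FFmpeg's tests directory.

```python
import math


def psnr_s16(original, decoded) -> float:
    """PSNR in dB between two equal-length sequences of 16-bit samples."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    peak = 32767  # assumed peak amplitude for signed 16-bit audio
    return 10 * math.log10(peak * peak / mse)
```

A higher value means the decoded output is closer to the input, so a more accurate transform should raise this number without requiring bitexact output.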
> 
> I'm not writing that, but I like the idea. The point of fixed-point decoders
> isn't bitexactness, but speed on slow hardware, so we shouldn't be testing
> an MD5.

Are your fixed-point transforms bitexact across all arches/cpuflags?

- Andreas


