From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sun, 25 Sep 2022 23:17:28 +0200
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH 1/6] opus: convert encoder and decoder to lavu/tx
References: <37cff64-511b-518d-769-f02c1fc7e49f@martin.st> <38345618-1535-1c53-6c28-52d0796e217@martin.st>
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"

Lynne:
> Sep 25, 2022, 14:34 by andreas.rheinhardt@outlook.com:
>
>> Lynne:
>>
>>> Sep 24, 2022, 23:57 by dev@lynne.ee:
>>>
>>>> Sep 24, 2022, 21:40 by martin@martin.st:
>>>>
>>>>> What about ac3dsp then - that one seems like it's fairly optimized for arm?
>>>>>
>>>> Haven't touched them, they're still being used. Unfortunately, for AC3,
>>>> the full MDCT optimizations in lavc do make a difference and the overall
>>>> decoder becomes 15% slower with this patch on for aarch64 with lavu/tx's
>>>> asm disabled and 7% slower with lavu/tx's asm enabled. I do plan to write
>>>> an aarch64 MDCT NEON SIMD code in a month or so, unless someone is faster,
>>>> which should make the decoder at least 10% faster with lavu/tx.
>>>>
>>>
>>> I'd just like to add this was for the float version of the ac3 decoder. The fixed-point
>>> version is a few percent faster with the patch on an A53, and quite a bit
>>> more accurate.
>>> The lavc fixed-point FFT code also has some weird large spikes in #cycles
>>> for some transform sizes, so the figure above is an average, but the dips
>>> went from 117x realtime to 78x realtime, which on a slower CPU may
>>> be the difference between stuttering and realtime playback.
>>> On this CPU, the fixed-point version is 23% slower than the float version,
>>> but on a CPU with slower float ops, it would make more sense to pick that
>>> decoder up than the float version.
>>> The 2 decoders produce nearly identical results, minus a few rounding
>>> errors, since AC3 is inherently a fixed-point codec. The only difference
>>> are the transforms themselves, and the extra ops needed to convert
>>> the 25bit ints to floats in the float decoder.
>>>
>>
>> 1. You forgot to remove mdct15 requirements from configure in this whole
>> patchset.
>> 2. You forgot to update the FATE references for several tests; e.g.
>> when only applying the ac3 patch, then I get this:
>>
>
> I know. durandal pointed it out the day I sent them. I'll send them again
> later.
> I'm planning to just push the Opus patch in a day with the mdct15
> line in configure gone.
>
>
>> As the above shows, the difference between the reference files and the
>> decoded output becomes larger in several tests, i.e. the reference files
>> won't be usable later on. If the new float and fixed-point decoders
>> indeed produce nearly identical output, then one could write
>> tests that decode the same file with both the floating point and the
>> fixed point decoder, check that both are nearly identical and print a
>> checksum of the output of the fixed point decoder.
>>
>
> I have a standalone program I've hacked on as I need to for the fixed-point
> transforms: https://0x0.st/oWxO.c
> The square root of the squared rounding error across the entire range
> (1 to 21 bits) of transforms from 32pt to 1024pt is 6.855655 for lavu and
> 7.141428 for lavc, which is slightly worse. If you extend the range
> to 22bits, the 1024pt transform in lavc explodes, while lavu is still fine,
> thus showing a greater range.
> The rounding errors are a lesser problem than hitting the max range,
> because then you get huge spikes in the output.
> I can further reduce the error in lavu at the cost of speed, but I think
> this is sufficient.
>
>
>> Also note that there is currently no test that directly verifies your
>> claims of greater accuracy. One could write such a test by encoding a
>> file with ac3-fixed and decoding it again (with the fixed point decoder)
>> and printing the psnr of input and output. No encoding test does this
>> at the moment.
>>
>
> I'm not writing that, but I like the idea, the point of fixed-point decoders
> isn't bitexactness, but speed on slow hardware, so we shouldn't be testing
> an MD5.

Are your fixed-point transforms bitexact across all arches/cpuflags?
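
The PSNR-based check discussed above could look roughly like the following. This is only a sketch under stated assumptions: it works on already-decoded signed 16-bit samples held in Python lists, it uses 32767 as the peak value, and the names psnr_s16 and passes as well as the min_db threshold are hypothetical. It is not an existing FATE helper. The point is that a threshold on PSNR still passes when two builds differ by a few rounding errors, which an MD5 of the output would not.

```python
import math


def psnr_s16(reference, decoded):
    """PSNR in dB between two equal-length sequences of signed 16-bit samples.

    Assumption: the peak value is 32767; other tools may use a different
    convention, so absolute numbers are not comparable across tools.
    """
    if len(reference) != len(decoded):
        raise ValueError("sample counts differ")
    if not reference:
        raise ValueError("no samples to compare")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # the two signals are identical
    peak = 32767.0
    return 10.0 * math.log10(peak * peak / mse)


def passes(reference, decoded, min_db):
    """Hypothetical pass/fail rule: accept any output at or above min_db."""
    return psnr_s16(reference, decoded) >= min_db
```

A test built this way would compare the encoder input against the output of the fixed-point decoder and fail only when the PSNR drops below the chosen threshold, so small per-arch rounding differences would not require new reference files.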
- Andreas
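
For reference, an error figure of the kind quoted in the thread ("the square root of the squared rounding error") can be computed along these lines. This is a sketch only: it assumes the figure is the square root of the summed squared differences between a fixed-point transform and a floating-point reference, which the linked program may or may not do. The naive integer DFT below is a stand-in written for illustration; it is neither the lavu/tx nor the lavc transform, and frac_bits is a hypothetical parameter.

```python
import cmath
import math


def dft_float(x):
    """Reference DFT in double precision (naive O(n^2) form)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_fixed(x, frac_bits=15):
    """Naive DFT on integers with twiddles quantized to frac_bits fractional bits.

    Returns a list of (re, im) integer pairs. Python integers do not overflow,
    so this models rounding error only, not the range limit of a 32-bit type.
    """
    n = len(x)
    one = 1 << frac_bits
    half = one >> 1
    out = []
    for k in range(n):
        re = 0
        im = 0
        for t in range(n):
            angle = -2.0 * math.pi * k * t / n
            re += x[t] * round(math.cos(angle) * one)
            im += x[t] * round(math.sin(angle) * one)
        # Drop the fractional bits, rounding to nearest.
        out.append(((re + half) >> frac_bits, (im + half) >> frac_bits))
    return out


def root_squared_error(ref, fixed):
    """Square root of the summed squared error over all real and imaginary parts."""
    total = 0.0
    for r, (f_re, f_im) in zip(ref, fixed):
        total += (r.real - f_re) ** 2 + (r.imag - f_im) ** 2
    return math.sqrt(total)
```

Running root_squared_error(dft_float(x), dft_fixed(x)) over inputs of increasing bit depth and transform size would give one number per implementation, comparable in spirit to the 6.855655 and 7.141428 figures above. Detecting the "explodes" case would additionally require modelling 32-bit wraparound, which this sketch does not do.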