Date: Sat, 23 Dec 2023 12:46:47 +0100
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
References: <20231222011549.16057-1-jamrial@gmail.com> <20231222011549.16057-2-jamrial@gmail.com> <20231222230834.GH6420@pb2> <20231222235259.2328-1-jamrial@gmail.com>
Subject: Re: [FFmpeg-devel] [PATCH 2/2 v2] x86/takdsp: add avx2 versions of all functions

Lynne:
> Dec 23, 2023, 00:53 by jamrial@gmail.com:
>
>> On an Intel Core i7 12700k:
>>
>> decorrelate_ls_c: 814.3
>> decorrelate_ls_sse2: 165.8
>> decorrelate_ls_avx2: 101.3
>> decorrelate_sf_c: 1602.6
>> decorrelate_sf_sse4: 640.1
>> decorrelate_sf_avx2: 324.6
>> decorrelate_sm_c: 1564.8
>> decorrelate_sm_sse2: 379.3
>> decorrelate_sm_avx2: 203.3
>> decorrelate_sr_c: 785.3
>> decorrelate_sr_sse2: 176.3
>> decorrelate_sr_avx2: 99.8
>>
>> Signed-off-by: James Almer
>>
>
> Even better on a Zen3:
> checkasm: all 8 tests passed
> decorrelate_ls_c: 111.1
> decorrelate_ls_sse2: 272.6
> decorrelate_ls_avx2: 94.1
> decorrelate_sf_c: 170.6
> decorrelate_sf_sse4: 400.1
> decorrelate_sf_avx2: 196.1
> decorrelate_sm_c: 187.6
> decorrelate_sm_sse2: 383.1
> decorrelate_sm_avx2: 179.1
> decorrelate_sr_c: 102.6
> decorrelate_sr_sse2: 272.6
> decorrelate_sr_avx2: 94.1
>

The SSE2 version is worse than the C version? Does this happen for more
DSP code?
(For decorrelate_sf, the C version is still the best and the gain of
AVX2 over C is not good for the other three either.)

- Andreas
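
The comparison above can be checked by dividing each C timing by the timing of the SIMD variant. The following is a minimal sketch in Python and is not part of the original message. The names `I7_12700K`, `ZEN3`, `speedup_over_c` and `slower_than_c` are hypothetical. The numbers are copied from the quoted checkasm output, and the sketch assumes they are timings, so lower is faster.

```python
# checkasm timings quoted in this thread (assumed: lower is faster).
I7_12700K = {
    "decorrelate_ls": {"c": 814.3, "sse2": 165.8, "avx2": 101.3},
    "decorrelate_sf": {"c": 1602.6, "sse4": 640.1, "avx2": 324.6},
    "decorrelate_sm": {"c": 1564.8, "sse2": 379.3, "avx2": 203.3},
    "decorrelate_sr": {"c": 785.3, "sse2": 176.3, "avx2": 99.8},
}
ZEN3 = {
    "decorrelate_ls": {"c": 111.1, "sse2": 272.6, "avx2": 94.1},
    "decorrelate_sf": {"c": 170.6, "sse4": 400.1, "avx2": 196.1},
    "decorrelate_sm": {"c": 187.6, "sse2": 383.1, "avx2": 179.1},
    "decorrelate_sr": {"c": 102.6, "sse2": 272.6, "avx2": 94.1},
}


def speedup_over_c(timings):
    """Map each function to {variant: c_time / variant_time}.

    A ratio below 1.0 means the SIMD variant is slower than the C version.
    """
    return {
        func: {
            variant: per_variant["c"] / time
            for variant, time in per_variant.items()
            if variant != "c"
        }
        for func, per_variant in timings.items()
    }


def slower_than_c(timings):
    """Return the sorted names of SIMD variants that lose to the C version."""
    return sorted(
        f"{func}_{variant}"
        for func, ratios in speedup_over_c(timings).items()
        for variant, ratio in ratios.items()
        if ratio < 1.0
    )


if __name__ == "__main__":
    for label, timings in (("i7 12700k", I7_12700K), ("Zen3", ZEN3)):
        print(label)
        for func, ratios in speedup_over_c(timings).items():
            for variant, ratio in ratios.items():
                print(f"  {func}_{variant}: {ratio:.2f}x vs C")
        print("  slower than C:", slower_than_c(timings) or "none")
```

With the Zen3 figures, `slower_than_c` returns all four SSE variants plus `decorrelate_sf_avx2`. With the i7 12700k figures it returns an empty list.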