From mboxrd@z Thu Jan 1 00:00:00 1970
From: Soft Works
To: FFmpeg development discussions and patches
Date: Thu, 28 Jul 2022 01:15:19 +0000
Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher
References: <05a46152f1b2458ea326edd9cfb6d817@amazon.com>
List-Id: FFmpeg development discussions and patches
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

> -----Original Message-----
> From: ffmpeg-devel On Behalf Of Soft Works
> Sent: Thursday, July 28, 2022 3:11 AM
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for
> gcc 7 and higher
>
> > -----Original Message-----
> > From: ffmpeg-devel On Behalf Of James Almer
> > Sent: Thursday, July 28, 2022 3:05 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for
> > gcc 7 and higher
> >
> > On 7/27/2022 10:02 PM, Soft Works wrote:
> > >
> > >> -----Original Message-----
> > >> From: ffmpeg-devel On Behalf Of Hendrik Leppkes
> > >> Sent: Wednesday, July 27, 2022 10:42 PM
> > >> To: FFmpeg development discussions and patches
> > >> Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization
> > >> for gcc 7 and higher
> > >>
> > >> On Wed, Jul 27, 2022 at 7:39 PM James Almer wrote:
> > >>> On 7/27/2022 2:34 PM, Swinney, Jonathan wrote:
> > >>>> I recognize that this patch is going to be somewhat
> > >>>> controversial. I'm submitting it mostly to see what the
> > >>>> opinions are and evaluate options. I am working on improving
> > >>>> performance for aarch64. On that architecture, there are
> > >>>> fewer hand written assembly implementations of hot functions
> > >>>> than there are for x86_64 and allowing gcc to auto-vectorize
> > >>>> yields noticeable improvements.
> > >>>>
> > >>>> Gcc vectorization has improved recently and it hasn't been
> > >>>> evaluated on the mailing list for a few years.
> > >>>> This is the latest discussion I found in my searches:
> > >>>> http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html
> > >>>
> > >>> Every time this was done, it was inevitably reverted after
> > >>> complaints and crash reports started piling up because gcc
> > >>> can't really handle all the inline code our codebase has,
> > >>> among other things.
> > >>>
> > >> No need to wait for issues, I just tested, and the same issues
> > >> still persist that have existed for years with GCC now. They
> > >> don't seem to care to make it compatible with inline asm, which
> > >> might be fair enough, but it means it just can't work here.
> > >>
> > >> In file included from libavcodec/cabac_functions.h:49,
> > >>                  from libavcodec/h264_cabac.c:36:
> > >> libavcodec/h264_cabac.c: In function 'ff_h264_decode_mb_cabac':
> > >> libavcodec/x86/cabac.h:199:5: error: 'asm' operand has
> > >> impossible constraints
> > >
> > > I wonder why it doesn't fail when I try the same on MINGW32:
> > >
> > > gcc -I. -Isrc/ -D_FORTIFY_SOURCE=0 -D__USE_MINGW_ANSI_STDIO=1
> > > -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
> > > -U__STRICT_ANSI__ -D__USE_MINGW_ANSI_STDIO=1
> > > -D__printf__=__gnu_printf__ -D_POSIX_C_SOURCE=200112
> > > -D_XOPEN_SOURCE=600 -DOPJ_STATIC -DZLIB_CONST -DHAVE_AV_CONFIG_H
> > > -DBUILDING_avcodec -mthreads -DLIBTWOLAME_STATIC -std=c11
> > > -IV:/ffbuild/mas/local32/include
> > > -IV:/ffbuild/mas/msys64/mingw32/include -I/mingw32/include
> > > -IF:/ffbuild/mas/local32/include -DLIBARCHIVE_STATIC
> > > -Wdeclaration-after-statement -Wall -Wdisabled-optimization
> > > -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits
> > > -Wundef -Wmissing-prototypes -Wstrict-prototypes -Wempty-body
> > > -Wno-parentheses -Wno-switch -Wno-format-zero-length
> > > -Wno-pointer-sign -Wno-unused-const-variable
> > > -Wno-bool-operation -Wno-char-subscripts -O3
> > > -Werror=format-security -Werror=implicit-function-declaration
> > > -Werror=missing-prototypes -Werror=return-type -Werror=vla
> > > -Wformat -fdiagnostics-color=auto -Wno-maybe-uninitialized
> > > -ftree-vectorize -MMD -MF libavcodec/h264_cabac.d
> > > -MT libavcodec/h264_cabac.o -c -o libavcodec/h264_cabac.o
> > > src/libavcodec/h264_cabac.c
> >
> > You didn't set CPU to haswell (Which will add -march=haswell to the
> > command line).
>
> Yup, you're right - this way I get the same error as Hendrik. Thanks!
>
> But then, when changing -O3 to -O2, it's compiling without
> error again.

Adding #pragma GCC optimize("no-tree-vectorize") to
get_cabac_inline_x86() allows compiling even with -O3
(the attribute approach doesn't seem to work).

softworkz
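
For reference, the pragma workaround could be sketched roughly as
below. This is only a minimal illustration and not the actual FFmpeg
change: sum_bytes_novec() and sum_bytes_default() are hypothetical
stand-ins (the real candidate is get_cabac_inline_x86()), and wrapping
the pragma in push_options/pop_options is an assumption added here so
that it does not leak into the rest of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC only: Clang also defines __GNUC__ but does not implement
 * these pragmas, so it is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize("no-tree-vectorize")
#endif

/* Hypothetical stand-in for a function that must not be vectorized.
 * Functions defined between push_options and pop_options are compiled
 * as if -fno-tree-vectorize had been given for them. */
unsigned sum_bytes_novec(const uint8_t *buf, size_t n)
{
    unsigned sum = 0;

    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options /* restore the command-line optimization flags */
#endif

/* Same loop, compiled with whatever the command line asks for
 * (e.g. -O3 -ftree-vectorize). */
unsigned sum_bytes_default(const uint8_t *buf, size_t n)
{
    unsigned sum = 0;

    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}
```

Both functions return the same result; only the generated code may
differ. Comparing the assembly from "gcc -O3 -march=haswell -S" for the
two functions is one way to check whether the pragma took effect, though
the outcome depends on the GCC version. The sketch does not show whether
the pragma is still honored once a function is force-inlined into a
caller.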