From mboxrd@z Thu Jan 1 00:00:00 1970
From: Soft Works
To: FFmpeg development discussions and patches
Date: Thu, 28 Jul 2022 01:10:45 +0000
Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher
References: <05a46152f1b2458ea326edd9cfb6d817@amazon.com>
List-Id: FFmpeg development discussions and patches
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

> -----Original Message-----
> From: ffmpeg-devel On Behalf Of James Almer
> Sent: Thursday, July 28, 2022 3:05 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for gcc
> 7 and higher
>
> On 7/27/2022 10:02 PM, Soft Works wrote:
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel On Behalf Of Hendrik Leppkes
> >> Sent: Wednesday, July 27, 2022 10:42 PM
> >> To: FFmpeg development discussions and patches
> >> <ffmpeg-devel@ffmpeg.org>
> >> Subject: Re: [FFmpeg-devel] [PATCH] enable auto vectorization for
> >> gcc 7 and higher
> >>
> >> On Wed, Jul 27, 2022 at 7:39 PM James Almer wrote:
> >>> On 7/27/2022 2:34 PM, Swinney, Jonathan wrote:
> >>>> I recognize that this patch is going to be somewhat
> >>>> controversial. I'm submitting it mostly to see what the opinions
> >>>> are and evaluate options. I am working on improving performance
> >>>> for aarch64. On that architecture, there are fewer hand written
> >>>> assembly implementations of hot functions than there are for
> >>>> x86_64 and allowing gcc to auto-vectorize yields noticeable
> >>>> improvements.
> >>>> Gcc vectorization has improved recently and it hasn't been
> >>>> evaluated on the mailing list for a few years. This is the latest
> >>>> discussion I found in my searches:
> >>>> http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html
> >>> Every time this was done, it was inevitably reverted after
> >>> complains and crash reports started piling up because gcc can't
> >>> really handle all the inline code our codebase has, among other
> >>> things.
> >>>
> >> No need to wait for issues, I just tested, and the same issues
> >> still persist that have existed for years with GCC now.
> >> They don't seem to care to make it compatible with inline asm,
> >> which might be fair enough, but it means it just can't work here.
> >>
> >> In file included from libavcodec/cabac_functions.h:49,
> >>                  from libavcodec/h264_cabac.c:36:
> >> libavcodec/h264_cabac.c: In function 'ff_h264_decode_mb_cabac':
> >> libavcodec/x86/cabac.h:199:5: error: 'asm' operand has impossible
> >> constraints
> >
> > I wonder why it doesn't fail when I try the same on MINGW32:
> >
> > gcc -I. -Isrc/ -D_FORTIFY_SOURCE=0 -D__USE_MINGW_ANSI_STDIO=1
> > -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
> > -U__STRICT_ANSI__ -D__USE_MINGW_ANSI_STDIO=1
> > -D__printf__=__gnu_printf__ -D_POSIX_C_SOURCE=200112
> > -D_XOPEN_SOURCE=600 -DOPJ_STATIC -DZLIB_CONST -DHAVE_AV_CONFIG_H
> > -DBUILDING_avcodec -mthreads -DLIBTWOLAME_STATIC -std=c11
> > -IV:/ffbuild/mas/local32/include
> > -IV:/ffbuild/mas/msys64/mingw32/include -I/mingw32/include
> > -IF:/ffbuild/mas/local32/include -DLIBARCHIVE_STATIC
> > -Wdeclaration-after-statement -Wall -Wdisabled-optimization
> > -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits
> > -Wundef -Wmissing-prototypes -Wstrict-prototypes -Wempty-body
> > -Wno-parentheses -Wno-switch -Wno-format-zero-length
> > -Wno-pointer-sign -Wno-unused-const-variable -Wno-bool-operation
> > -Wno-char-subscripts -O3 -Werror=format-security
> > -Werror=implicit-function-declaration -Werror=missing-prototypes
> > -Werror=return-type -Werror=vla -Wformat -fdiagnostics-color=auto
> > -Wno-maybe-uninitialized -ftree-vectorize -MMD
> > -MF libavcodec/h264_cabac.d -MT libavcodec/h264_cabac.o -c
> > -o libavcodec/h264_cabac.o src/libavcodec/h264_cabac.c
>
> You didn't set CPU to haswell (Which will add -march=haswell to the
> command line).

Yup, you're right - this way I get the same error as Hendrik.
Thanks!

But then, when changing -O3 to -O2, it's compiling without error again.
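
As a side note for anyone comparing the two optimization levels: it may help to look first at a loop that contains no inline asm at all. The following is only a sketch, not code from FFmpeg; the function name add_sat_u8 is hypothetical and was made up for illustration. It has no dependency between iterations, so it is the kind of plain C loop the vectorizer is meant to handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg: adds two byte buffers and
 * clamps each result to 255. The loop contains no inline asm and no
 * dependency between iterations, which makes it a candidate for
 * GCC's loop vectorizer. */
void add_sat_u8(uint8_t *restrict dst, const uint8_t *restrict a,
                const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Widen before adding so the sum cannot wrap around. */
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = sum > 255 ? 255 : (uint8_t)sum;
    }
}
```

Building this file once with -O2 -ftree-vectorize and once with -O3, each time adding GCC's -fopt-info-vec option, should print a note for every loop that was vectorized. Whether this particular loop is reported depends on the GCC version and the -march setting, so treat it as a way to check rather than as a prediction.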

softworkz
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".