From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Swinney, Jonathan"
To: "ffmpeg-devel@ffmpeg.org"
Date: Wed, 27 Jul 2022 17:34:52 +0000
Message-ID: <05a46152f1b2458ea326edd9cfb6d817@amazon.com>
Subject: [FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher
List-Id: FFmpeg development discussions and patches

I recognize that this patch is going to be somewhat controversial. I'm
submitting it mostly to see what the opinions are and to evaluate options.

I am working on improving performance for aarch64. On that architecture, there
are fewer hand-written assembly implementations of hot functions than there
are for x86_64, and allowing GCC to auto-vectorize yields noticeable
improvements.

GCC's vectorization has improved recently, and it hasn't been evaluated on the
mailing list for a few years. This is the latest discussion I found in my
searches:
http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html

If the community is not comfortable accepting a patch like this outright, would
you be willing to accept a new option to the configure script, something like
--enable-auto-vectorization?

Thanks!

Signed-off-by: Jonathan Swinney
---
 configure | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 6629d14099..c63c9348ad 100755
--- a/configure
+++ b/configure
@@ -7173,7 +7173,9 @@ if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
+    case $gcc_basever in
+        2|2.*|3.*|4.*|5.*|6.*) check_optflags -fno-tree-vectorize ;;
+    esac
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
-- 
2.32.0
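
For illustration, the version gate in the patch can be exercised on its own.
This is a minimal sketch, not part of the patch: it assumes gcc_basever holds
a version string such as the output of `$cc -dumpversion`, and the function
name vectorize_policy is hypothetical and does not exist in configure. It only
reports which branch the case patterns would take.

```shell
#!/bin/sh
# Hypothetical helper (not part of FFmpeg's configure): prints whether the
# patched logic would leave auto-vectorization enabled for a version string.
vectorize_policy() {
    case $1 in
        # Same patterns as the patch: these versions keep -fno-tree-vectorize.
        2|2.*|3.*|4.*|5.*|6.*) echo "disabled" ;;
        # Anything else falls through, so the flag is not added.
        *)                     echo "enabled" ;;
    esac
}

# Sample version strings, chosen only for illustration.
for v in 4.8.5 6.3.0 7 7.5.0 11.2.0; do
    printf '%s -> auto-vectorization %s\n' "$v" "$(vectorize_policy "$v")"
done
```

Note that the patterns are shell globs, so `6.*` requires a literal dot: a
bare `6` would not match and would fall through to the default branch.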