From: "Swinney, Jonathan"
To: "ffmpeg-devel@ffmpeg.org"
Date: Mon, 8 Aug 2022 15:25:36 +0000
Subject: [FFmpeg-devel] [PATCH v2] add a configure flag to enable tree-vectorization with gcc
List-Id: FFmpeg development discussions and patches

Recent versions of gcc have improved automatic vectorization. This flag
allows adventurous users to enable it. The known problems are primarily
related to inline assembly for x86, so to address those, add a pragma that
explicitly disables automatic vectorization for the affected files.

Signed-off-by: Jonathan Swinney
--

Thank you for considering this patch. I believe it addresses the primary
concerns that were raised about my previous submission. There may be more
files that require the pragma to add `-fno-tree-vectorize`, and I welcome
suggestions.

This should strike a compromise, allowing some users to enable vectorization
without breaking mainstream builds. It should also give time to work out
additional problems, if they arise, before vectorization is enabled more
broadly.
---
 configure              | 7 ++++++-
 libavcodec/x86/cabac.h | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index cbbb4dd9c8..8e842da1b8 100755
--- a/configure
+++ b/configure
@@ -110,6 +110,7 @@ Configuration options:
   --disable-swscale-alpha disable alpha channel support in swscale
   --disable-all disable building components, libraries and programs
   --disable-autodetect disable automatically detected external libraries [no]
+  --enable-auto-vectorization enable compiler auto vectorization
 
 Program options:
   --disable-programs do not build command line programs
@@ -1945,6 +1946,7 @@ FEATURE_LIST="
     small
     static
     swscale_alpha
+    auto_vectorization
 "
 
 # this list should be kept in linking order
@@ -7176,7 +7178,9 @@ if enabled icc; then
             disable aligned_stack
         fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
+    if disabled auto_vectorization; then
+        check_optflags -fno-tree-vectorize
+    fi
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
@@ -7569,6 +7573,7 @@ echo "pod2man enabled ${pod2man-no}"
 echo "makeinfo enabled ${makeinfo-no}"
 echo "makeinfo supports HTML ${makeinfo_html-no}"
 echo "xmllint enabled ${xmllint-no}"
+echo "auto-vectorization ${auto_vectorization-no}"
 test -n "$random_seed" &&
     echo "random seed ${random_seed}"
 echo
diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
index b046a56a6b..782e4cbda4 100644
--- a/libavcodec/x86/cabac.h
+++ b/libavcodec/x86/cabac.h
@@ -39,6 +39,10 @@
 
 #if HAVE_INLINE_ASM
 
+#ifdef __GNUC__
+    __attribute__((optimize("-fno-tree-vectorize")))
+#endif
+
 #ifndef UNCHECKED_BITSTREAM_READER
 #define UNCHECKED_BITSTREAM_READER !CONFIG_SAFE_BITSTREAM_READER
 #endif
-- 
2.37.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".