From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wu, Jianhua"
To: "ffmpeg-devel@ffmpeg.org"
Date: Wed, 9 Mar 2022 07:38:05 +0000
Subject: Re: [FFmpeg-devel] [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
References: <20220223085735.70854-1-jianhua.wu@intel.com>
List-Id: FFmpeg development discussions and patches
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Ping.

> From: Wu, Jianhua
> Sent: Wednesday, March 2, 2022 1:34 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: RE: [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
>
> Ping.
>
> > From: Wu, Jianhua
> > Sent: Wednesday, February 23, 2022 4:58 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Wu, Jianhua
> > Subject: [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
> >
> > From: Wu Jianhua
> >
> > Signed-off-by: Wu Jianhua
> > ---
> >  configure                 | 13 +++++++---
> >  libavutil/cpu.c           |  1 +
> >  libavutil/cpu.h           |  1 +
> >  libavutil/x86/cpu.c       |  8 ++++--
> >  libavutil/x86/cpu.h       |  1 +
> >  libavutil/x86/x86inc.asm  | 53 ++++++++++++++++++++-------------------
> >  tests/checkasm/checkasm.c | 35 +++++++++++++-------------
> >  7 files changed, 63 insertions(+), 49 deletions(-)
> >
> > diff --git a/configure b/configure
> > index 1535dc3c5b..d88c2ae979 100755
> > --- a/configure
> > +++ b/configure
> > @@ -444,6 +444,7 @@ Optimization options (experts only):
> >    --disable-fma4           disable FMA4 optimizations
> >    --disable-avx2           disable AVX2 optimizations
> >    --disable-avx512         disable AVX-512 optimizations
> > +  --disable-avx512icl      disable AVX-512ICL optimizations
> >    --disable-aesni          disable AESNI optimizations
> >    --disable-armv5te        disable armv5te optimizations
> >    --disable-armv6          disable armv6 optimizations
> > @@ -2098,6 +2099,7 @@ ARCH_EXT_LIST_X86_SIMD="
> >      avx
> >      avx2
> >      avx512
> > +    avx512icl
> >      fma3
> >      fma4
> >      mmx
> > @@ -2666,6 +2668,7 @@ fma3_deps="avx"
> >  fma4_deps="avx"
> >  avx2_deps="avx"
> >  avx512_deps="avx2"
> > +avx512icl_deps="avx512"
> >
> >  mmx_external_deps="x86asm"
> >  mmx_inline_deps="inline_asm x86"
> > @@ -6128,10 +6131,11 @@ EOF
> >          elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
> >      esac
> >
> > -    enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0"
> > -    enabled avx2   && check_x86asm avx2_external   "vextracti128 xmm0, ymm0, 0"
> > -    enabled xop    && check_x86asm xop_external    "vpmacsdd xmm0, xmm1, xmm2, xmm3"
> > -    enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0, ymm1, ymm2, ymm3"
> > +    enabled avx512    && check_x86asm avx512_external    "vmovdqa32 [eax]{k1}{z}, zmm0"
> > +    enabled avx512icl && check_x86asm avx512icl_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28"
> > +    enabled avx2      && check_x86asm avx2_external      "vextracti128 xmm0, ymm0, 0"
> > +    enabled xop       && check_x86asm xop_external       "vpmacsdd xmm0, xmm1, xmm2, xmm3"
> > +    enabled fma4      && check_x86asm fma4_external      "vfmaddps ymm0, ymm1, ymm2, ymm3"
> >      check_x86asm cpunop          "CPU amdnop"
> >  fi
> >
> > @@ -7471,6 +7475,7 @@ if enabled x86; then
> >      echo "AVX enabled               ${avx-no}"
> >      echo "AVX2 enabled              ${avx2-no}"
> >      echo "AVX-512 enabled           ${avx512-no}"
> > +    echo "AVX-512ICL enabled        ${avx512icl-no}"
> >      echo "XOP enabled               ${xop-no}"
> >      echo "FMA3 enabled              ${fma3-no}"
> >      echo "FMA4 enabled              ${fma4-no}"
> > diff --git a/libavutil/cpu.c b/libavutil/cpu.c
> > index 1368502245..833c220192 100644
> > --- a/libavutil/cpu.c
> > +++ b/libavutil/cpu.c
> > @@ -137,6 +137,7 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
> >          { "cmov",     NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_CMOV     },    .unit = "flags" },
> >          { "aesni",    NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AESNI    },    .unit = "flags" },
> >          { "avx512"  , NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AVX512   },    .unit = "flags" },
> > +        { "avx512icl",  NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_AVX512ICL   }, .unit = "flags" },
> >          { "slowgather", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SLOW_GATHER }, .unit = "flags" },
> >
> >  #define CPU_FLAG_P2 AV_CPU_FLAG_CMOV | AV_CPU_FLAG_MMX
> > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > index ce9bf14bf7..9711e574c5 100644
> > --- a/libavutil/cpu.h
> > +++ b/libavutil/cpu.h
> > @@ -54,6 +54,7 @@
> >  #define AV_CPU_FLAG_BMI1          0x20000 ///< Bit Manipulation Instruction Set 1
> >  #define AV_CPU_FLAG_BMI2          0x40000 ///< Bit Manipulation Instruction Set 2
> >  #define AV_CPU_FLAG_AVX512       0x100000 ///< AVX-512 functions: requires OS support even if YMM/ZMM registers aren't used
> > +#define AV_CPU_FLAG_AVX512ICL    0x200000 ///< F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
> >  #define AV_CPU_FLAG_SLOW_GATHER  0x2000000 ///< CPU has slow gathers.
> >
> >  #define AV_CPU_FLAG_ALTIVEC      0x0001 ///< standard
> > diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c
> > index 7b13fcae91..d6cd4fab9c 100644
> > --- a/libavutil/x86/cpu.c
> > +++ b/libavutil/x86/cpu.c
> > @@ -150,9 +150,13 @@ int ff_get_cpu_flags_x86(void)
> >                  rval |= AV_CPU_FLAG_AVX2;
> >  #if HAVE_AVX512 /* F, CD, BW, DQ, VL */
> >              if ((xcr0_lo & 0xe0) == 0xe0) { /* OPMASK/ZMM state */
> > -                if ((rval & AV_CPU_FLAG_AVX2) && (ebx & 0xd0030000) == 0xd0030000)
> > +                if ((rval & AV_CPU_FLAG_AVX2) && (ebx & 0xd0030000) == 0xd0030000) {
> >                      rval |= AV_CPU_FLAG_AVX512;
> > -
> > +#if HAVE_AVX512ICL
> > +                    if ((ebx & 0xd0200000) == 0xd0200000 && (ecx & 0x5f42) == 0x5f42)
> > +                        rval |= AV_CPU_FLAG_AVX512ICL;
> > +#endif /* HAVE_AVX512ICL */
> > +                }
> >              }
> >  #endif /* HAVE_AVX512 */
> >  #endif /* HAVE_AVX2 */
> > diff --git a/libavutil/x86/cpu.h b/libavutil/x86/cpu.h
> > index 937c697fa0..40a1eef0ab 100644
> > --- a/libavutil/x86/cpu.h
> > +++ b/libavutil/x86/cpu.h
> > @@ -80,6 +80,7 @@
> >  #define EXTERNAL_AVX2_SLOW(flags)   CPUEXT_SUFFIX_SLOW2(flags, _EXTERNAL, AVX2, AVX)
> >  #define EXTERNAL_AESNI(flags)       CPUEXT_SUFFIX(flags, _EXTERNAL, AESNI)
> >  #define EXTERNAL_AVX512(flags)      CPUEXT_SUFFIX(flags, _EXTERNAL, AVX512)
> > +#define EXTERNAL_AVX512ICL(flags)   CPUEXT_SUFFIX(flags, _EXTERNAL, AVX512ICL)
> >
> >  #define INLINE_AMD3DNOW(flags)      CPUEXT_SUFFIX(flags, _INLINE, AMD3DNOW)
> >  #define INLINE_AMD3DNOWEXT(flags)   CPUEXT_SUFFIX(flags, _INLINE, AMD3DNOWEXT)
> > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
> > index 01c35e3a4b..251ee797de 100644
> > --- a/libavutil/x86/x86inc.asm
> > +++ b/libavutil/x86/x86inc.asm
> > @@ -817,32 +817,33 @@ BRANCH_INSTR jz, je, jnz, jne, jl, jle, jnl, jnle, jg, jge, jng, jnge, ja, jae,
> >
> >  ; cpuflags
> >
> > -%assign cpuflags_mmx      (1<<0)
> > -%assign cpuflags_mmx2     (1<<1) | cpuflags_mmx
> > -%assign cpuflags_3dnow    (1<<2) | cpuflags_mmx
> > -%assign cpuflags_3dnowext (1<<3) | cpuflags_3dnow
> > -%assign cpuflags_sse      (1<<4) | cpuflags_mmx2
> > -%assign cpuflags_sse2     (1<<5) | cpuflags_sse
> > -%assign cpuflags_sse2slow (1<<6) | cpuflags_sse2
> > -%assign cpuflags_lzcnt    (1<<7) | cpuflags_sse2
> > -%assign cpuflags_sse3     (1<<8) | cpuflags_sse2
> > -%assign cpuflags_ssse3    (1<<9) | cpuflags_sse3
> > -%assign cpuflags_sse4     (1<<10)| cpuflags_ssse3
> > -%assign cpuflags_sse42    (1<<11)| cpuflags_sse4
> > -%assign cpuflags_aesni    (1<<12)| cpuflags_sse42
> > -%assign cpuflags_avx      (1<<13)| cpuflags_sse42
> > -%assign cpuflags_xop      (1<<14)| cpuflags_avx
> > -%assign cpuflags_fma4     (1<<15)| cpuflags_avx
> > -%assign cpuflags_fma3     (1<<16)| cpuflags_avx
> > -%assign cpuflags_bmi1     (1<<17)| cpuflags_avx|cpuflags_lzcnt
> > -%assign cpuflags_bmi2     (1<<18)| cpuflags_bmi1
> > -%assign cpuflags_avx2     (1<<19)| cpuflags_fma3|cpuflags_bmi2
> > -%assign cpuflags_avx512   (1<<20)| cpuflags_avx2 ; F, CD, BW, DQ, VL
> > -
> > -%assign cpuflags_cache32  (1<<21)
> > -%assign cpuflags_cache64  (1<<22)
> > -%assign cpuflags_aligned  (1<<23) ; not a cpu feature, but a function variant
> > -%assign cpuflags_atom     (1<<24)
> > +%assign cpuflags_mmx       (1<<0)
> > +%assign cpuflags_mmx2      (1<<1) | cpuflags_mmx
> > +%assign cpuflags_3dnow     (1<<2) | cpuflags_mmx
> > +%assign cpuflags_3dnowext  (1<<3) | cpuflags_3dnow
> > +%assign cpuflags_sse       (1<<4) | cpuflags_mmx2
> > +%assign cpuflags_sse2      (1<<5) | cpuflags_sse
> > +%assign cpuflags_sse2slow  (1<<6) | cpuflags_sse2
> > +%assign cpuflags_lzcnt     (1<<7) | cpuflags_sse2
> > +%assign cpuflags_sse3      (1<<8) | cpuflags_sse2
> > +%assign cpuflags_ssse3     (1<<9) | cpuflags_sse3
> > +%assign cpuflags_sse4      (1<<10)| cpuflags_ssse3
> > +%assign cpuflags_sse42     (1<<11)| cpuflags_sse4
> > +%assign cpuflags_aesni     (1<<12)| cpuflags_sse42
> > +%assign cpuflags_avx       (1<<13)| cpuflags_sse42
> > +%assign cpuflags_xop       (1<<14)| cpuflags_avx
> > +%assign cpuflags_fma4      (1<<15)| cpuflags_avx
> > +%assign cpuflags_fma3      (1<<16)| cpuflags_avx
> > +%assign cpuflags_bmi1      (1<<17)| cpuflags_avx|cpuflags_lzcnt
> > +%assign cpuflags_bmi2      (1<<18)| cpuflags_bmi1
> > +%assign cpuflags_avx2      (1<<19)| cpuflags_fma3|cpuflags_bmi2
> > +%assign cpuflags_avx512    (1<<20)| cpuflags_avx2 ; F, CD, BW, DQ, VL
> > +%assign cpuflags_avx512icl (1<<25)| cpuflags_avx512
> > +
> > +%assign cpuflags_cache32   (1<<21)
> > +%assign cpuflags_cache64   (1<<22)
> > +%assign cpuflags_aligned   (1<<23) ; not a cpu feature, but a function variant
> > +%assign cpuflags_atom      (1<<24)
> >
> >  ; Returns a boolean value expressing whether or not the specified cpuflag is enabled.
> >  %define    cpuflag(x) (((((cpuflags & (cpuflags_ %+ x)) ^ (cpuflags_ %+ x)) - 1) >> 31) & 1)
> > diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> > index f74125e810..e77b4ec20f 100644
> > --- a/tests/checkasm/checkasm.c
> > +++ b/tests/checkasm/checkasm.c
> > @@ -220,23 +220,24 @@ static const struct {
> >      { "MMI",      "mmi",      AV_CPU_FLAG_MMI },
> >      { "MSA",      "msa",      AV_CPU_FLAG_MSA },
> >  #elif ARCH_X86
> > -    { "MMX",      "mmx",      AV_CPU_FLAG_MMX|AV_CPU_FLAG_CMOV },
> > -    { "MMXEXT",   "mmxext",   AV_CPU_FLAG_MMXEXT },
> > -    { "3DNOW",    "3dnow",    AV_CPU_FLAG_3DNOW },
> > -    { "3DNOWEXT", "3dnowext", AV_CPU_FLAG_3DNOWEXT },
> > -    { "SSE",      "sse",      AV_CPU_FLAG_SSE },
> > -    { "SSE2",     "sse2",     AV_CPU_FLAG_SSE2|AV_CPU_FLAG_SSE2SLOW },
> > -    { "SSE3",     "sse3",     AV_CPU_FLAG_SSE3|AV_CPU_FLAG_SSE3SLOW },
> > -    { "SSSE3",    "ssse3",    AV_CPU_FLAG_SSSE3|AV_CPU_FLAG_ATOM },
> > -    { "SSE4.1",   "sse4",     AV_CPU_FLAG_SSE4 },
> > -    { "SSE4.2",   "sse42",    AV_CPU_FLAG_SSE42 },
> > -    { "AES-NI",   "aesni",    AV_CPU_FLAG_AESNI },
> > -    { "AVX",      "avx",      AV_CPU_FLAG_AVX },
> > -    { "XOP",      "xop",      AV_CPU_FLAG_XOP },
> > -    { "FMA3",     "fma3",     AV_CPU_FLAG_FMA3 },
> > -    { "FMA4",     "fma4",     AV_CPU_FLAG_FMA4 },
> > -    { "AVX2",     "avx2",     AV_CPU_FLAG_AVX2 },
> > -    { "AVX-512",  "avx512",   AV_CPU_FLAG_AVX512 },
> > +    { "MMX",        "mmx",       AV_CPU_FLAG_MMX|AV_CPU_FLAG_CMOV },
> > +    { "MMXEXT",     "mmxext",    AV_CPU_FLAG_MMXEXT },
> > +    { "3DNOW",      "3dnow",     AV_CPU_FLAG_3DNOW },
> > +    { "3DNOWEXT",   "3dnowext",  AV_CPU_FLAG_3DNOWEXT },
> > +    { "SSE",        "sse",       AV_CPU_FLAG_SSE },
> > +    { "SSE2",       "sse2",      AV_CPU_FLAG_SSE2|AV_CPU_FLAG_SSE2SLOW },
> > +    { "SSE3",       "sse3",      AV_CPU_FLAG_SSE3|AV_CPU_FLAG_SSE3SLOW },
> > +    { "SSSE3",      "ssse3",     AV_CPU_FLAG_SSSE3|AV_CPU_FLAG_ATOM },
> > +    { "SSE4.1",     "sse4",      AV_CPU_FLAG_SSE4 },
> > +    { "SSE4.2",     "sse42",     AV_CPU_FLAG_SSE42 },
> > +    { "AES-NI",     "aesni",     AV_CPU_FLAG_AESNI },
> > +    { "AVX",        "avx",       AV_CPU_FLAG_AVX },
> > +    { "XOP",        "xop",       AV_CPU_FLAG_XOP },
> > +    { "FMA3",       "fma3",      AV_CPU_FLAG_FMA3 },
> > +    { "FMA4",       "fma4",      AV_CPU_FLAG_FMA4 },
> > +    { "AVX2",       "avx2",      AV_CPU_FLAG_AVX2 },
> > +    { "AVX-512",    "avx512",    AV_CPU_FLAG_AVX512 },
> > +    { "AVX-512ICL", "avx512icl", AV_CPU_FLAG_AVX512ICL },
> >  #elif ARCH_LOONGARCH
> >      { "LSX",      "lsx",      AV_CPU_FLAG_LSX },
> >      { "LASX",     "lasx",     AV_CPU_FLAG_LASX },
> > --
> > 2.17.1

Hi there,

These patches were sent two weeks ago but have received no response so far.
Could the maintainers of the CPU flags and native HEVC decoding code help
review this patchset?

Thanks,
Jianhua