From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by master.gitmailbox.com (Postfix) with ESMTP id 13C0644C6C for ; Mon, 14 Nov 2022 13:30:45 +0000 (UTC) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0F03F68BDA5; Mon, 14 Nov 2022 15:30:43 +0200 (EET) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 9A7DA68BBA0 for ; Mon, 14 Nov 2022 15:30:36 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1668432641; x=1699968641; h=from:to:subject:date:message-id:references:in-reply-to: content-transfer-encoding:mime-version; bh=dswggSS4AzaMBQNvMMujMpwLg4TumicpJNi2qqFcoWU=; b=eyeJebzB40hSavLdTW1PpWcM+hsvY8PrIfM05Q0ok7JoAoB9N2dEHezA y8g7F4euZOCVSjXN1D/iDAg+M+pgV5b5F75iluf7Tt4MrHZ3sXpiC4FUZ rzJBY4ysp7w14lS6H41O7whrTUrYoYbSU300vPT6iWHmYuoQDRvJV/xQr NNDYZzJnxXDAr/LuC8v/GNFO9w/msDifH2S44GbOrybNU8tpF2IyYBTpB W31i4fXIL/df6uq92vmnCEuu9l7lV9hKde82hiRPYgWuWdKCHxKnYzUfM wBaWyD4m4dPVKViB2TPVyu0Oyv1Yngu1YuixmAOs+UqTnwedt7yr1N7vQ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10531"; a="338754166" X-IronPort-AV: E=Sophos;i="5.96,161,1665471600"; d="scan'208";a="338754166" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Nov 2022 05:30:33 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10531"; a="616310477" X-IronPort-AV: E=Sophos;i="5.96,161,1665471600"; d="scan'208";a="616310477" Received: from fmsmsx603.amr.corp.intel.com ([10.18.126.83]) by orsmga006.jf.intel.com with ESMTP; 14 Nov 2022 05:30:33 -0800 Received: from fmsmsx610.amr.corp.intel.com (10.18.126.90) by fmsmsx603.amr.corp.intel.com (10.18.126.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 14 Nov 2022 05:30:33 -0800 Received: 
from fmsmsx611.amr.corp.intel.com (10.18.126.91) by fmsmsx610.amr.corp.intel.com (10.18.126.90) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 14 Nov 2022 05:30:32 -0800 Received: from fmsedg601.ED.cps.intel.com (10.1.192.135) by fmsmsx611.amr.corp.intel.com (10.18.126.91) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31 via Frontend Transport; Mon, 14 Nov 2022 05:30:32 -0800 Received: from NAM10-BN7-obe.outbound.protection.outlook.com (104.47.70.106) by edgegateway.intel.com (192.55.55.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2375.31; Mon, 14 Nov 2022 05:30:31 -0800 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RnfDoxpDpb+m23A7SJN1r3g0EO9OAzuBSbZPgSK8Xc/lvIgGE5lKhuF9Cl2wAV2ELfeMcUIWC92HbZxiVQ39WJLZ8xP9J9C0iF/OLo0dpJY1e0OY+Lt63zqnC5f/cNh+O1gSykIjkQIugxCLUrfWldI4IUhqQ8Kz0EINnzcmNoQYisWoCDPDXk7Rj1rX8mcxgO3OBVLpXnaipLYPNkc5jfurWTTNLOalgsux73oNStCDdojUgY2nMZZilxSp2vrKPHpeQn8CHdYxTGteqlyz7FlUgqfj6ftAGWzJm1kqxvhZJjtiFe07CZXMH2ZnfQRC3GB9sV79bliyhKfBIEms9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dswggSS4AzaMBQNvMMujMpwLg4TumicpJNi2qqFcoWU=; b=UCMd7ZfIKM5Mrfo3R2M+14LYgKLHzeXNC8v0uJpYFqxlqc2rn5+1Wa8A4NH0mbtVlj8aMSp6UMk/U5E/hg082T61tQeKB4jTKkrdqZVRXmuQKoz+W7T1CbKzi5XyTCblSV/7xDStD7mtB+wKsgZiWsGeYZW6DGbzPYRwPq6EaImAC42j25mjtDPRW1EYxHVB+nEsfx2Hy9YBV/8jCwjQq6S5T+tbP4YWX7qOKiimsgexHw60TZwPOeeRMXehr7Hb9vLduSxVYiHRVTvPuIzMkJndKUIktN2Qe8DOEaN4GQ3Sz6wT3qA93QfGnEttYn7+5xICcGPQXTO1t7Q8dgRoNw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass 
header.d=intel.com; arc=none Received: from BN6PR11MB1746.namprd11.prod.outlook.com (2603:10b6:404:fb::20) by CY5PR11MB6140.namprd11.prod.outlook.com (2603:10b6:930:28::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5813.17; Mon, 14 Nov 2022 13:30:24 +0000 Received: from BN6PR11MB1746.namprd11.prod.outlook.com ([fe80::1565:8f45:a36f:8704]) by BN6PR11MB1746.namprd11.prod.outlook.com ([fe80::1565:8f45:a36f:8704%9]) with mapi id 15.20.5813.017; Mon, 14 Nov 2022 13:30:24 +0000 From: "Wang, Bin" To: FFmpeg development discussions and patches Thread-Topic: [FFmpeg-devel] [PATCH v7] libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI Thread-Index: AQHY8Cqnk334s11ZYkKsTF5Sp70+yK4+cBGAgAAG/hA= Date: Mon, 14 Nov 2022 13:30:24 +0000 Message-ID: References: <20221104082925.25598-1-bin.wang@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-version: 11.6.500.17 dlp-product: dlpe-windows dlp-reaction: no-action authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=intel.com; x-ms-publictraffictype: Email x-ms-traffictypediagnostic: BN6PR11MB1746:EE_|CY5PR11MB6140:EE_ x-ms-office365-filtering-correlation-id: 7c45c01c-f7f2-426a-e38c-08dac64462b6 x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: 
CaK7UrE9DbHosQnPWIK9lzjCczpDZwzMqWvcgLJ+2uPUDedO8KA+A8fPjfJlwyT9yz8tkHD3R61TCU9T5IIC/4p6vIsiSNTUVKCYbt8a6w64OMlLlx0mpBxuIY1oBt+SS7mWN2H1Sr1YsVkreQd2kJbKRv9Jsc99RVY2T/3f/nntsABgIwvExnBhCPn2CGvfQlN+uiNxIz0AODJ/YJsQUfModG9f2weoHKiQ5XNZqsSjwu5F6H8mow4aR5t4DykkXKXpR9Ej5PzrCM/ulpgA8fwAtIaTFoLNJ8TSsoXAvkPMLChsNPN9lWLHmrYMRHSOBN5hZlndld2QeNcmm1kzj25K+SA80tysZley0yKIiy1o8h1ZJRFBsqskesdhMJhsqFIA5EtIstl8HkjcRObBiy/Uxe9In35lbbBoY4Bzb8TrdsUjgUBMD+GPuHtTDwMefWjaDSxl5rtHC2w/Za8q88n4Wt+Z7DHD0r7d5+SEbx7sjMLCsnGDQdcKZJXiMARWz3jicOxDuWfFpcHs7+H2aObhd+aZolRAcRXSa0HZpc+KUvioublCsUsWQh9wUWVNSa37kOgpQz34e64frnlA4vd5Rgszpxyy9KqBZJZfSqjbr/Jl3J7cNRfYpY/FzDCcpmvmTLZaXy8vHhcGtzifXo1iff1i+sQVeP1PkE8TC1tyhgHQdD2yQmAt5Xd9zChykPLLR2aBAVroFlKrwD0tMd7GMBiGXioFXoeNVN5ASaPmfeVgQO1/gY9bju35qHhwl3ruyuhVmV4W1HPbRCF+DOh/eBAiZzjB3/lBn94asLA= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:BN6PR11MB1746.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(376002)(136003)(396003)(346002)(366004)(39860400002)(451199015)(66899015)(316002)(66556008)(66446008)(41300700001)(82960400001)(66476007)(66946007)(64756008)(76116006)(8676002)(122000001)(38070700005)(38100700002)(6916009)(478600001)(9686003)(26005)(8936002)(52536014)(83380400001)(966005)(2906002)(33656002)(86362001)(5660300002)(186003)(7696005)(6506007)(55016003)(71200400001); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?Gp6yvmGC2/Zz/WKaTodQ5DpV0cPdBmylGKPgY99vd5lHAfAwmVFraBN8eClr?= =?us-ascii?Q?4Kf3NfX5fTmB2DiJxE34lJty0LcZ0SLtFkqBsu+L0Y1m+ybau9ArcAD/2Ytb?= =?us-ascii?Q?hGGVJRbLlyG+QRkZK1oOhoGo5pjCvvipJz0cmpbM1BSm/kGB9D15gNm65x9E?= =?us-ascii?Q?F5hVUqbh9t2bo+UH4NjEP0/JqBMxQ2j9BR9jSZ6E6RTXc7CFK2fTIVuPVBFz?= =?us-ascii?Q?LAKSYyGTAXDGo4TfYLc5T9KAhUsd2KIlMlahCc3e5Ga58LjJ+yz1bGAYtDEm?= =?us-ascii?Q?r+gc3QA6dteK60KFPx6tvE9CBeKurFt3a4K1cvx5D9FtmrighaDk95Ae8MnT?= 
=?us-ascii?Q?8QCMsakC8pH1jVDnYgntTvDXImoiKJQ969GPEAF4AvUrzk7olG+WOGv6eI8d?= =?us-ascii?Q?FHFDdK2P/d817w1SyHF0vo2UIR4zDVFuVV76SbKF0AtOD6OE/66ydUC2FOZD?= =?us-ascii?Q?hfg0eMtd14hYPjLZa2/YMuz3XwX+6zcvITHQmnYfhAMWl/CAxxgB0PpI+F04?= =?us-ascii?Q?o+ykk6a7NSRVQSVOTa73CEOjfBOWbXu9rs1vBa8g1bz8/SsysgGZ7S0C7RLc?= =?us-ascii?Q?iQKUA0HOPb/T//guJvivAp33prZCSnNkdTsEPtcZN7Z5U0TM51eCTi5yzuKg?= =?us-ascii?Q?dEW7JIwStDrC+qWRLv4HAR/ezeCTby787DV1KknGU12TKyGgZhGM7UEYdHH2?= =?us-ascii?Q?7q4KS+iIhiz8wMcRPAnRiMvzOTxvOFnWgDGSn30WG//K5xbWOF1c1pVrBWRh?= =?us-ascii?Q?PNWYJBtKnl4Bop0w581dA3NtjbN+QvNLLUEX4AD2ziUzokhWA8MhtuQKQTmY?= =?us-ascii?Q?c9qabEEPotrLxYOzYytJ8C6EFh+4Op/k15OtgrSR5dnHGfghScfZZUOc8Fxz?= =?us-ascii?Q?J9lul63w9qhwskz1Dwjb9SdkugJVVAyHWgEYkG3Si44qLZdwVvG+hEf/RXUp?= =?us-ascii?Q?QDxtP7oMuLNLi4Uzk38Ts9HVRwFe7hEm3lAFJk0Y5j7qKsXC2W9lvz7xQPi4?= =?us-ascii?Q?lg8m1Z5Wt6Gn3bP4mqbnK6XNcK9lvUGH6hH1IXHde1gOssWJjSmoiZx+QrGX?= =?us-ascii?Q?aTeSAou91/a4fJS8Z4oR1Rt00a3wZFPMNgP8rIK6mvjSPQzkEYsFCkaN/acx?= =?us-ascii?Q?9bM/rfU2LIR4U5zmH3wlGN+ZersaCJHchrRbOxEGtzleRJ7L+S3g+NuYTp+I?= =?us-ascii?Q?YYFkZec+ihlRerYf7MMOzmTXyjiXRT+7N1G2YpPX7RdBTcTmiSwftt926FVo?= =?us-ascii?Q?qoRG6bJKlSO4YlLstRNkyr8+MzyWpyRw7PwU01CY/UgfaRRZOcbWlMFFhIoo?= =?us-ascii?Q?H7hqYz6sbhpfpKEbmwbctVCVZeEPZonLG2d7n6qJpnuiEDD0Z7TJpSTB4zhy?= =?us-ascii?Q?Qsu6dQKIgw2WUZQ1BkQK7HAotHwkeSKdZyFAiOuOniie7SZJdkmDQkP6pCov?= =?us-ascii?Q?cjJhR4CogjcVgOGb2RfSSKtm3EBiugr5XHrTKecOHk3F7Jziax/cOh1v1lJx?= =?us-ascii?Q?DeRoox8GmtgpWl7SkILYyMaCCTzghW5Un45zFwdNnqWKfQx1jyZNoG6wyWH6?= =?us-ascii?Q?bHWJ2lpHLYe7yuOupLE=3D?= MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: BN6PR11MB1746.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 7c45c01c-f7f2-426a-e38c-08dac64462b6 X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Nov 2022 13:30:24.2262 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d 
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: q/56P4nZqX4SeV4oXYoblilnV/at+FjF3vdK0rBxj5sP0YjKLdoGjmECJBli0odAK+wEArY5+8mw8xveA253ng== X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY5PR11MB6140 X-OriginatorOrg: intel.com Subject: Re: [FFmpeg-devel] [PATCH v7] libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Archived-At: List-Archive: List-Post:

> By using xmm# you're not taking into account any x86inc SWAPing, so this is
> using xmm0 and xmm1 where the single scalar float input arguments reside (at
> least on unix64), instead of xm0 and xm1 (xmm16 and xmm17) where the
> broadcasted scalars were stored.
> This, again, only worked by chance on unix64 because you're using scalar fmadd,
> and shouldn't work at all on win64.
>
> Also, all these as is are being encoded as VEX, not EVEX, but it should be fine
> leaving them untouched instead of using xm#, since they will be shorter (five
> bytes instead of six for some) by using the lower, non callee-saved regs.

Thanks for the help. I'm not familiar with WIN64 asm. So what I need to do is
change the WIN64 swap from:

    SWAP xmm0, xmm2
    SWAP xmm1, xmm3

To:

    VBROADCASTSS m0, xmm2
    VBROADCASTSS m1, xmm3

Is that correct?
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".