From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 9 Sep 2022 14:43:15 +0200
Message-ID: <20220909124315.2780058-1-andreas.rheinhardt@outlook.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 2/2] Revert "avcodec/loongarch/h264chroma, vc1dsp_lasx: Add wrapper for __lasx_xvldx"
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

This reverts commit 2c8dc7e953e532752500e8145aa1ceee908bda2f.

The loongarch headers have been fixed, so that this wrapper is no
longer necessary.

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/loongarch/h264chroma_lasx.c    | 90 +++++++++++------------
 libavcodec/loongarch/vc1dsp_lasx.c        | 16 ++--
 libavutil/loongarch/loongson_intrinsics.h |  5 --
 3 files changed, 53 insertions(+), 58 deletions(-)
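A note on the revert (kept below the "---" so it stays out of the commit
message): the LASX_XVLDX macro removed at the end of this patch existed
only to cast away const. A minimal sketch of the before/after, assuming
the old lasxintrin.h declared __lasx_xvldx() with a plain void * and the
fixed header const-qualifies it (the exact declarations are not part of
this patch):

    /* assumed old header:   __m256i __lasx_xvldx(void *addr, long offset);
     * assumed fixed header: __m256i __lasx_xvldx(const void *addr, long offset); */

    /* The shim cast away const so that const source pointers could be passed: */
    #define LASX_XVLDX(ptr, stride) __lasx_xvldx((void*)ptr, stride)

    /* With the fixed header, a const uint8_t *src can be passed directly: */
    __m256i v = __lasx_xvldx(src, stride);
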
diff --git a/libavcodec/loongarch/h264chroma_lasx.c b/libavcodec/loongarch/h264chroma_lasx.c
index 5e611997f4..1c0e002bdf 100644
--- a/libavcodec/loongarch/h264chroma_lasx.c
+++ b/libavcodec/loongarch/h264chroma_lasx.c
@@ -51,7 +51,7 @@ static av_always_inline void avc_chroma_hv_8x4_lasx(const uint8_t *src, uint8_t
     __m256i coeff_vt_vec1 = __lasx_xvreplgr2vr_h(coef_ver1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src1, src2, src3, src4);
     DUP2_ARG3(__lasx_xvpermi_q, src2, src1, 0x20, src4, src3, 0x20, src1, src3);
     src0 = __lasx_xvshuf_b(src0, src0, mask);
@@ -91,10 +91,10 @@ static av_always_inline void avc_chroma_hv_8x8_lasx(const uint8_t *src, uint8_t
     __m256i coeff_vt_vec1 = __lasx_xvreplgr2vr_h(coef_ver1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src1, src2, src3, src4);
     src += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src5, src6, src7, src8);
     DUP4_ARG3(__lasx_xvpermi_q, src2, src1, 0x20, src4, src3, 0x20, src6, src5, 0x20,
              src8, src7, 0x20, src1, src3, src5, src7);
@@ -141,8 +141,8 @@ static av_always_inline void avc_chroma_hz_8x4_lasx(const uint8_t *src, uint8_t
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src1, src2);
-    src3 = LASX_XVLDX(src, stride_3x);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src1, src2);
+    src3 = __lasx_xvldx(src, stride_3x);
     DUP2_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src3, src2, 0x20, src0, src2);
     DUP2_ARG3(__lasx_xvshuf_b, src0, src0, mask, src2, src2, mask, src0, src2);
     DUP2_ARG2(__lasx_xvdp2_h_bu, src0, coeff_vec, src2, coeff_vec, res0, res1);
@@ -170,11 +170,11 @@ static av_always_inline void avc_chroma_hz_8x8_lasx(const uint8_t *src, uint8_t
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src1, src2, src3, src4);
     src += stride_4x;
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src5, src6);
-    src7 = LASX_XVLDX(src, stride_3x);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src5, src6);
+    src7 = __lasx_xvldx(src, stride_3x);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src3, src2, 0x20, src5, src4, 0x20,
              src7, src6, 0x20, src0, src2, src4, src6);
     DUP4_ARG3(__lasx_xvshuf_b, src0, src0, mask, src2, src2, mask, src4, src4, mask,
@@ -212,7 +212,7 @@ static av_always_inline void avc_chroma_hz_nonmult_lasx(const uint8_t *src,
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     for (row = height >> 2; row--;) {
-        DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+        DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
                  src0, src1, src2, src3);
         src += stride_4x;
         DUP2_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src3, src2, 0x20, src0, src2);
@@ -228,7 +228,7 @@ static av_always_inline void avc_chroma_hz_nonmult_lasx(const uint8_t *src,
 
     if ((height & 3)) {
         src0 = __lasx_xvld(src, 0);
-        src1 = LASX_XVLDX(src, stride);
+        src1 = __lasx_xvldx(src, stride);
         src1 = __lasx_xvpermi_q(src1, src0, 0x20);
         src0 = __lasx_xvshuf_b(src1, src1, mask);
         res0 = __lasx_xvdp2_h_bu(src0, coeff_vec);
@@ -253,7 +253,7 @@ static av_always_inline void avc_chroma_vt_8x4_lasx(const uint8_t *src, uint8_t
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
     src0 = __lasx_xvld(src, 0);
     src += stride;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src1, src2, src3, src4);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src2, src1, 0x20, src3, src2, 0x20,
              src4, src3, 0x20, src0, src1, src2, src3);
@@ -282,10 +282,10 @@ static av_always_inline void avc_chroma_vt_8x8_lasx(const uint8_t *src, uint8_t
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
     src0 = __lasx_xvld(src, 0);
     src += stride;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src1, src2, src3, src4);
     src += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src5, src6, src7, src8);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src2, src1, 0x20, src3, src2, 0x20,
              src4, src3, 0x20, src0, src1, src2, src3);
@@ -402,7 +402,7 @@ static void avc_chroma_hv_4x2_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vt_vec = __lasx_xvpermi_q(coeff_vt_vec1, coeff_vt_vec0, 0x02);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride_2, src1, src2);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride_2, src1, src2);
     DUP2_ARG3(__lasx_xvshuf_b, src1, src0, mask, src2, src1, mask, src0, src1);
     src0 = __lasx_xvpermi_q(src0, src1, 0x02);
     res_hz = __lasx_xvdp2_h_bu(src0, coeff_hz_vec);
@@ -431,7 +431,7 @@ static void avc_chroma_hv_4x4_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vt_vec1 = __lasx_xvreplgr2vr_h(coef_ver1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src1, src2, src3, src4);
     DUP4_ARG3(__lasx_xvshuf_b, src1, src0, mask, src2, src1, mask, src3, src2, mask,
              src4, src3, mask, src0, src1, src2, src3);
@@ -464,10 +464,10 @@ static void avc_chroma_hv_4x8_lasx(const uint8_t *src, uint8_t * dst, ptrdiff_t
     __m256i coeff_vt_vec1 = __lasx_xvreplgr2vr_h(coef_ver1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src1, src2, src3, src4);
     src += stride_4;
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src5, src6, src7, src8);
     DUP4_ARG3(__lasx_xvshuf_b, src1, src0, mask, src2, src1, mask, src3, src2, mask,
              src4, src3, mask, src0, src1, src2, src3);
@@ -519,7 +519,7 @@ static void avc_chroma_hz_4x2_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vec = __lasx_xvilvl_b(coeff_vec0, coeff_vec1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    src1 = LASX_XVLDX(src, stride);
+    src1 = __lasx_xvldx(src, stride);
     src0 = __lasx_xvshuf_b(src1, src0, mask);
     res = __lasx_xvdp2_h_bu(src0, coeff_vec);
     res = __lasx_xvslli_h(res, 3);
@@ -540,8 +540,8 @@ static void avc_chroma_hz_4x4_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vec = __lasx_xvilvl_b(coeff_vec0, coeff_vec1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride_2, src1, src2);
-    src3 = LASX_XVLDX(src, stride_3);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride_2, src1, src2);
+    src3 = __lasx_xvldx(src, stride_3);
     DUP2_ARG3(__lasx_xvshuf_b, src1, src0, mask, src3, src2, mask, src0, src2);
     src0 = __lasx_xvpermi_q(src0, src2, 0x02);
     res = __lasx_xvdp2_h_bu(src0, coeff_vec);
@@ -567,11 +567,11 @@ static void avc_chroma_hz_4x8_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 32, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src1, src2, src3, src4);
     src += stride_4;
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride_2, src5, src6);
-    src7 = LASX_XVLDX(src, stride_3);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride_2, src5, src6);
+    src7 = __lasx_xvldx(src, stride_3);
     DUP4_ARG3(__lasx_xvshuf_b, src1, src0, mask, src3, src2, mask, src5, src4, mask,
              src7, src6, mask, src0, src2, src4, src6);
     DUP2_ARG3(__lasx_xvpermi_q, src0, src2, 0x02, src4, src6, 0x02, src0, src4);
@@ -625,7 +625,7 @@ static void avc_chroma_vt_4x2_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vec = __lasx_xvilvl_b(coeff_vec0, coeff_vec1);
 
     src0 = __lasx_xvld(src, 0);
-    DUP2_ARG2(LASX_XVLDX, src, stride, src, stride << 1, src1, src2);
+    DUP2_ARG2(__lasx_xvldx, src, stride, src, stride << 1, src1, src2);
     DUP2_ARG2(__lasx_xvilvl_b, src1, src0, src2, src1, tmp0, tmp1);
     tmp0 = __lasx_xvilvl_d(tmp1, tmp0);
     res = __lasx_xvdp2_h_bu(tmp0, coeff_vec);
@@ -649,7 +649,7 @@ static void avc_chroma_vt_4x4_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     __m256i coeff_vec = __lasx_xvilvl_b(coeff_vec0, coeff_vec1);
 
     src0 = __lasx_xvld(src, 0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src1, src2, src3, src4);
     DUP4_ARG2(__lasx_xvilvl_b, src1, src0, src2, src1, src3, src2, src4, src3,
              tmp0, tmp1, tmp2, tmp3);
@@ -679,10 +679,10 @@ static void avc_chroma_vt_4x8_lasx(const uint8_t *src, uint8_t *dst, ptrdiff_t s
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     src0 = __lasx_xvld(src, 0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src1, src2, src3, src4);
     src += stride_4;
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2, src, stride_3,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2, src, stride_3,
              src, stride_4, src5, src6, src7, src8);
     DUP4_ARG2(__lasx_xvilvl_b, src1, src0, src2, src1, src3, src2, src4, src3,
              tmp0, tmp1, tmp2, tmp3);
@@ -860,7 +860,7 @@ static av_always_inline void avc_chroma_hv_and_aver_dst_8x4_lasx(const uint8_t *
     __m256i coeff_vt_vec1 = __lasx_xvreplgr2vr_h(coef_ver1);
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src1, src2, src3, src4);
     DUP2_ARG3(__lasx_xvpermi_q, src2, src1, 0x20, src4, src3, 0x20, src1, src3);
     src0 = __lasx_xvshuf_b(src0, src0, mask);
@@ -874,7 +874,7 @@ static av_always_inline void avc_chroma_hv_and_aver_dst_8x4_lasx(const uint8_t *
     res_vt0 = __lasx_xvmadd_h(res_vt0, res_hz0, coeff_vt_vec1);
     res_vt1 = __lasx_xvmadd_h(res_vt1, res_hz1, coeff_vt_vec1);
     out = __lasx_xvssrarni_bu_h(res_vt1, res_vt0, 6);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     tp0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
@@ -907,10 +907,10 @@ static av_always_inline void avc_chroma_hv_and_aver_dst_8x8_lasx(const uint8_t *
 
     DUP2_ARG2(__lasx_xvld, chroma_mask_arr, 0, src, 0, mask, src0);
     src += stride;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src1, src2, src3, src4);
     src += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src5, src6, src7, src8);
     DUP4_ARG3(__lasx_xvpermi_q, src2, src1, 0x20, src4, src3, 0x20, src6, src5, 0x20,
              src8, src7, 0x20, src1, src3, src5, src7);
@@ -934,12 +934,12 @@ static av_always_inline void avc_chroma_hv_and_aver_dst_8x8_lasx(const uint8_t *
     res_vt3 = __lasx_xvmadd_h(res_vt3, res_hz3, coeff_vt_vec1);
     DUP2_ARG3(__lasx_xvssrarni_bu_h, res_vt1, res_vt0, 6, res_vt3, res_vt2, 6,
              out0, out1);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     dst0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
     dst += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     dst -= stride_4x;
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
@@ -973,13 +973,13 @@ static av_always_inline void avc_chroma_hz_and_aver_dst_8x4_lasx(const uint8_t *
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     mask = __lasx_xvld(chroma_mask_arr, 0);
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src0, src1, src2, src3);
     DUP2_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src3, src2, 0x20, src0, src2);
     DUP2_ARG3(__lasx_xvshuf_b, src0, src0, mask, src2, src2, mask, src0, src2);
     DUP2_ARG2(__lasx_xvdp2_h_bu, src0, coeff_vec, src2, coeff_vec, res0, res1);
     out = __lasx_xvssrarni_bu_h(res1, res0, 6);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     tp0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
@@ -1008,10 +1008,10 @@ static av_always_inline void avc_chroma_hz_and_aver_dst_8x8_lasx(const uint8_t *
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     mask = __lasx_xvld(chroma_mask_arr, 0);
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src0, src1, src2, src3);
     src += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src4, src5, src6, src7);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src3, src2, 0x20, src5, src4, 0x20,
              src7, src6, 0x20, src0, src2, src4, src6);
@@ -1020,12 +1020,12 @@ static av_always_inline void avc_chroma_hz_and_aver_dst_8x8_lasx(const uint8_t *
     DUP4_ARG2(__lasx_xvdp2_h_bu, src0, coeff_vec, src2, coeff_vec, src4, coeff_vec,
              src6, coeff_vec, res0, res1, res2, res3);
     DUP2_ARG3(__lasx_xvssrarni_bu_h, res1, res0, 6, res3, res2, 6, out0, out1);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     dst0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
     dst += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     dst -= stride_4x;
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
@@ -1059,14 +1059,14 @@ static av_always_inline void avc_chroma_vt_and_aver_dst_8x4_lasx(const uint8_t *
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
 
     src0 = __lasx_xvld(src, 0);
-    DUP4_ARG2(LASX_XVLDX, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
+    DUP4_ARG2(__lasx_xvldx, src, stride, src, stride_2x, src, stride_3x, src, stride_4x,
              src1, src2, src3, src4);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src2, src1, 0x20, src3, src2, 0x20,
              src4, src3, 0x20, src0, src1, src2, src3);
     DUP2_ARG2(__lasx_xvilvl_b, src1, src0, src3, src2, src0, src2);
     DUP2_ARG2(__lasx_xvdp2_h_bu, src0, coeff_vec, src2, coeff_vec, res0, res1);
     out = __lasx_xvssrarni_bu_h(res1, res0, 6);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     tp0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
@@ -1095,10 +1095,10 @@ static av_always_inline void avc_chroma_vt_and_aver_dst_8x8_lasx(const uint8_t *
     coeff_vec = __lasx_xvslli_b(coeff_vec, 3);
     src0 = __lasx_xvld(src, 0);
     src += stride;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src1, src2, src3, src4);
     src += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, src, 0, src, stride, src, stride_2x, src, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, src, 0, src, stride, src, stride_2x, src, stride_3x,
              src5, src6, src7, src8);
     DUP4_ARG3(__lasx_xvpermi_q, src1, src0, 0x20, src2, src1, 0x20, src3, src2, 0x20,
              src4, src3, 0x20, src0, src1, src2, src3);
@@ -1109,12 +1109,12 @@ static av_always_inline void avc_chroma_vt_and_aver_dst_8x8_lasx(const uint8_t *
     DUP4_ARG2(__lasx_xvdp2_h_bu, src0, coeff_vec, src2, coeff_vec, src4, coeff_vec,
              src6, coeff_vec, res0, res1, res2, res3);
     DUP2_ARG3(__lasx_xvssrarni_bu_h, res1, res0, 6, res3, res2, 6, out0, out1);
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
     dst0 = __lasx_xvpermi_q(tp2, tp0, 0x20);
     dst += stride_4x;
-    DUP4_ARG2(LASX_XVLDX, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
+    DUP4_ARG2(__lasx_xvldx, dst, 0, dst, stride, dst, stride_2x, dst, stride_3x,
              tp0, tp1, tp2, tp3);
     dst -= stride_4x;
     DUP2_ARG2(__lasx_xvilvl_d, tp2, tp0, tp3, tp1, tp0, tp2);
diff --git a/libavcodec/loongarch/vc1dsp_lasx.c b/libavcodec/loongarch/vc1dsp_lasx.c
index 12f68ee028..848fe4afb3 100644
--- a/libavcodec/loongarch/vc1dsp_lasx.c
+++ b/libavcodec/loongarch/vc1dsp_lasx.c
@@ -831,20 +831,20 @@ static void put_vc1_mspel_mc_h_lasx(uint8_t *dst, const uint8_t *src,
     const_para1_2 = __lasx_xvreplgr2vr_h(*(para_v + 1));
 
     in0 = __lasx_xvld(_src, 0);
-    DUP2_ARG2(LASX_XVLDX, _src, stride, _src, stride2, in1, in2);
-    in3 = LASX_XVLDX(_src, stride3);
+    DUP2_ARG2(__lasx_xvldx, _src, stride, _src, stride2, in1, in2);
+    in3 = __lasx_xvldx(_src, stride3);
     _src += stride4;
     in4 = __lasx_xvld(_src, 0);
-    DUP2_ARG2(LASX_XVLDX, _src, stride, _src, stride2, in5, in6);
-    in7 = LASX_XVLDX(_src, stride3);
+    DUP2_ARG2(__lasx_xvldx, _src, stride, _src, stride2, in5, in6);
+    in7 = __lasx_xvldx(_src, stride3);
     _src += stride4;
     in8 = __lasx_xvld(_src, 0);
-    DUP2_ARG2(LASX_XVLDX, _src, stride, _src, stride2, in9, in10);
-    in11 = LASX_XVLDX(_src, stride3);
+    DUP2_ARG2(__lasx_xvldx, _src, stride, _src, stride2, in9, in10);
+    in11 = __lasx_xvldx(_src, stride3);
     _src += stride4;
     in12 = __lasx_xvld(_src, 0);
-    DUP2_ARG2(LASX_XVLDX, _src, stride, _src, stride2, in13, in14);
-    in15 = LASX_XVLDX(_src, stride3);
+    DUP2_ARG2(__lasx_xvldx, _src, stride, _src, stride2, in13, in14);
+    in15 = __lasx_xvldx(_src, stride3);
     DUP4_ARG2(__lasx_xvilvl_b, in2, in0, in3, in1, in6, in4, in7, in5,
              tmp0_m, tmp1_m, tmp2_m, tmp3_m);
     DUP4_ARG2(__lasx_xvilvl_b, in10, in8, in11, in9, in14, in12, in15, in13,
diff --git a/libavutil/loongarch/loongson_intrinsics.h b/libavutil/loongarch/loongson_intrinsics.h
index 090adab266..eb256863c8 100644
--- a/libavutil/loongarch/loongson_intrinsics.h
+++ b/libavutil/loongarch/loongson_intrinsics.h
@@ -716,11 +716,6 @@ static inline __m128i __lsx_vclip255_w(__m128i _in) {
 
 #ifdef __loongarch_asx
 #include <lasxintrin.h>
-
-/* __lasx_xvldx() in lasxintrin.h does not accept a const void*;
- * remove the following once it does. */
-#define LASX_XVLDX(ptr, stride) __lasx_xvldx((void*)ptr, stride)
-
 /*
  * =============================================================================
  * Description : Dot product of byte vector elements
-- 
2.34.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".