From: toqsxw@outlook.com
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 19 Jan 2024 21:38:16 +0800
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240119133820.1048730-1-toqsxw@outlook.com>
References: <20240119133820.1048730-1-toqsxw@outlook.com>
X-Microsoft-Original-Message-ID: <20240119133820.1048730-4-toqsxw@outlook.com>
Subject: [FFmpeg-devel] [PATCH v2 4/8] avcodec/x86/h26x/h2656_inter: add dststride to put
Cc: Wu Jianhua

From: Wu Jianhua

Signed-off-by: Wu Jianhua
---
 libavcodec/x86/h26x/h2656_inter.asm | 32 ++++++++++++++---------------
 libavcodec/x86/h26x/h2656dsp.c      |  4 ++--
 libavcodec/x86/h26x/h2656dsp.h      |  2 +-
 libavcodec/x86/hevcdsp_init.c       |  2 +-
 4 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/libavcodec/x86/h26x/h2656_inter.asm b/libavcodec/x86/h26x/h2656_inter.asm
index 4316c8ae3d..68f88832a6 100644
--- a/libavcodec/x86/h26x/h2656_inter.asm
+++ b/libavcodec/x86/h26x/h2656_inter.asm
@@ -22,8 +22,6 @@
 ; */
 %include "libavutil/x86/x86util.asm"
 
-%define MAX_PB_SIZE 64
-
 SECTION_RODATA 32
 cextern pw_255
 cextern pw_512
@@ -332,7 +330,7 @@ SECTION .text
 %endmacro
 
 %macro LOOP_END 3
-    add %1q, 2*MAX_PB_SIZE ; dst += dststride
+    add %1q, dststrideq ; dst += dststride
     add %2q, %3q ; src += srcstride
     dec heightd ; cmp height
     jnz .loop ; height loop
@@ -529,7 +527,7 @@ SECTION .text
 
 
 ; ******************************
-; void %1_put_pixels(int16_t *dst, const uint8_t *_src, ptrdiff_t srcstride,
+; void %1_put_pixels(int16_t *dst, ptrdiff_t dststride, const uint8_t *_src, ptrdiff_t srcstride,
 ;     int height, const int8_t *hf, const int8_t *vf, int width)
 ; ******************************
 
@@ -539,7 +537,7 @@ SECTION .text
 %endmacro
 
 %macro MC_PIXELS 3
-cglobal %1_put_pixels%2_%3, 4, 4, 3, dst, src, srcstride, height
+cglobal %1_put_pixels%2_%3, 5, 5, 3, dst, dststride, src, srcstride, height
     pxor m2, m2
 .loop:
     SIMPLE_LOAD %2, %3, srcq, m0
@@ -569,10 +567,10 @@ cglobal %1_put_uni_pixels%2_%3, 5, 5, 2, dst, dststride, src, srcstride, height
 %endif
 
 ; ******************************
-; void %1_put_4tap_hX(int16_t *dst,
+; void %1_put_4tap_hX(int16_t *dst, ptrdiff_t dststride,
 ;     const uint8_t *_src, ptrdiff_t _srcstride, int height, int8_t *hf, int8_t *vf, int width);
 ; ******************************
-cglobal %1_put_4tap_h%2_%3, 5, 5, XMM_REGS, dst, src, srcstride, height, hf
+cglobal %1_put_4tap_h%2_%3, 6, 6, XMM_REGS, dst, dststride, src, srcstride, height, hf
 %assign %%stride ((%3 + 7)/8)
     MC_4TAP_FILTER %3, hf, m4, m5
 .loop:
@@ -602,10 +600,10 @@ cglobal %1_put_uni_4tap_h%2_%3, 6, 7, XMM_REGS, dst, dststride, src, srcstride,
     RET
 
 ; ******************************
-; void %1_put_4tap_v(int16_t *dst,
+; void %1_put_4tap_v(int16_t *dst, ptrdiff_t dststride,
 ;     const uint8_t *_src, ptrdiff_t _srcstride, int height, int8_t *hf, int8_t *vf, int width)
 ; ******************************
-cglobal %1_put_4tap_v%2_%3, 6, 6, XMM_REGS, dst, src, srcstride, height, r3src, vf
+cglobal %1_put_4tap_v%2_%3, 7, 7, XMM_REGS, dst, dststride, src, srcstride, height, r3src, vf
     sub srcq, srcstrideq
     MC_4TAP_FILTER %3, vf, m4, m5
     lea r3srcq, [srcstrideq*3]
@@ -639,10 +637,10 @@ cglobal %1_put_uni_4tap_v%2_%3, 7, 7, XMM_REGS, dst, dststride, src, srcstride,
 
 %macro PUT_4TAP_HV 3
 ; ******************************
-; void put_4tap_hv(int16_t *dst,
+; void put_4tap_hv(int16_t *dst, ptrdiff_t dststride,
 ;     const uint8_t *_src, ptrdiff_t _srcstride, int height, int8_t *hf, int8_t *vf, int width)
 ; ******************************
-cglobal %1_put_4tap_hv%2_%3, 6, 7, 16 , dst, src, srcstride, height, hf, vf, r3src
+cglobal %1_put_4tap_hv%2_%3, 7, 8, 16 , dst, dststride, src, srcstride, height, hf, vf, r3src
 %assign %%stride ((%3 + 7)/8)
     sub srcq, srcstrideq
     MC_4TAP_HV_FILTER %3
@@ -774,12 +772,12 @@ cglobal %1_put_uni_4tap_hv%2_%3, 7, 8, 16 , dst, dststride, src, srcstride, heig
 %endmacro
 
 ; ******************************
-; void put_8tap_hX_X_X(int16_t *dst, const uint8_t *_src, ptrdiff_t srcstride,
+; void put_8tap_hX_X_X(int16_t *dst, ptrdiff_t dststride, const uint8_t *_src, ptrdiff_t srcstride,
 ;     int height, const int8_t *hf, const int8_t *vf, int width)
 ; ******************************
 
 %macro PUT_8TAP 3
-cglobal %1_put_8tap_h%2_%3, 5, 5, 16, dst, src, srcstride, height, hf
+cglobal %1_put_8tap_h%2_%3, 6, 6, 16, dst, dststride, src, srcstride, height, hf
     MC_8TAP_FILTER %3, hf
 .loop:
     MC_8TAP_H_LOAD %3, srcq, %2, 10
@@ -814,10 +812,10 @@ cglobal %1_put_uni_8tap_h%2_%3, 6, 7, 16 , dst, dststride, src, srcstride, heigh
 
 
 ; ******************************
-; void put_8tap_vX_X_X(int16_t *dst, const uint8_t *_src, ptrdiff_t srcstride,
+; void put_8tap_vX_X_X(int16_t *dst, ptrdiff_t dststride, const uint8_t *_src, ptrdiff_t srcstride,
 ;     int height, const int8_t *hf, const int8_t *vf, int width)
 ; ******************************
-cglobal %1_put_8tap_v%2_%3, 6, 8, 16, dst, src, srcstride, height, r3src, vf
+cglobal %1_put_8tap_v%2_%3, 7, 8, 16, dst, dststride, src, srcstride, height, r3src, vf
     MC_8TAP_FILTER %3, vf
     lea r3srcq, [srcstrideq*3]
 .loop:
@@ -856,11 +854,11 @@ cglobal %1_put_uni_8tap_v%2_%3, 7, 9, 16, dst, dststride, src, srcstride, height
 
 
 ; ******************************
-; void put_8tap_hvX_X(int16_t *dst, const uint8_t *_src, ptrdiff_t srcstride,
+; void put_8tap_hvX_X(int16_t *dst, ptrdiff_t dststride, const uint8_t *_src, ptrdiff_t srcstride,
 ;     int height, const int8_t *hf, const int8_t *vf, int width)
 ; ******************************
 %macro PUT_8TAP_HV 3
-cglobal %1_put_8tap_hv%2_%3, 6, 7, 16, 0 - mmsize*16, dst, src, srcstride, height, hf, vf, r3src
+cglobal %1_put_8tap_hv%2_%3, 7, 8, 16, 0 - mmsize*16, dst, dststride, src, srcstride, height, hf, vf, r3src
     MC_8TAP_FILTER %3, hf, 0
     lea hfq, [rsp]
     MC_8TAP_FILTER %3, vf, 8*mmsize
diff --git a/libavcodec/x86/h26x/h2656dsp.c b/libavcodec/x86/h26x/h2656dsp.c
index 27769f9c55..7ef1234936 100644
--- a/libavcodec/x86/h26x/h2656dsp.c
+++ b/libavcodec/x86/h26x/h2656dsp.c
@@ -24,7 +24,7 @@
 #include "h2656dsp.h"
 
 #define mc_rep_func(name, bitd, step, W, opt) \
-void ff_h2656_put_##name##W##_##bitd##_##opt(int16_t *_dst, \
+void ff_h2656_put_##name##W##_##bitd##_##opt(int16_t *_dst, ptrdiff_t dststride, \
     const uint8_t *_src, ptrdiff_t _srcstride, int height, const int8_t *hf, const int8_t *vf, int width) \
 { \
     int i; \
@@ -32,7 +32,7 @@ void ff_h2656_put_##name##W##_##bitd##_##opt(int16_t *_dst,
     for (i = 0; i < W; i += step) { \
         const uint8_t *src = _src + (i * ((bitd + 7) / 8)); \
         dst = _dst + i; \
-        ff_h2656_put_##name##step##_##bitd##_##opt(dst, src, _srcstride, height, hf, vf, width); \
+        ff_h2656_put_##name##step##_##bitd##_##opt(dst, dststride, src, _srcstride, height, hf, vf, width); \
     } \
 }
 
diff --git a/libavcodec/x86/h26x/h2656dsp.h b/libavcodec/x86/h26x/h2656dsp.h
index 8a2ab13607..e31aae6b0d 100644
--- a/libavcodec/x86/h26x/h2656dsp.h
+++ b/libavcodec/x86/h26x/h2656dsp.h
@@ -30,7 +30,7 @@
 #include
 
 #define H2656_PEL_PROTOTYPE(name, D, opt) \
-void ff_h2656_put_ ## name ## _ ## D ## _##opt(int16_t *dst, const uint8_t *_src, ptrdiff_t _srcstride, int height, const int8_t *hf, const int8_t *vf, int width); \
+void ff_h2656_put_ ## name ## _ ## D ## _##opt(int16_t *dst, ptrdiff_t dststride, const uint8_t *_src, ptrdiff_t _srcstride, int height, const int8_t *hf, const int8_t *vf, int width); \
 void ff_h2656_put_uni_ ## name ## _ ## D ## _##opt(uint8_t *_dst, ptrdiff_t _dststride, const uint8_t *_src, ptrdiff_t _srcstride, int height, const int8_t *hf, const int8_t *vf, int width); \
 
 #define H2656_MC_8TAP_PROTOTYPES(fname, bitd, opt) \
diff --git a/libavcodec/x86/hevcdsp_init.c b/libavcodec/x86/hevcdsp_init.c
index 5c19330e19..e0dc82eef0 100644
--- a/libavcodec/x86/hevcdsp_init.c
+++ b/libavcodec/x86/hevcdsp_init.c
@@ -96,7 +96,7 @@ void ff_hevc_put_hevc_ ## a ## _ ## depth ## _##opt(int16_t *dst, const uint8_t
         int height, intptr_t mx, intptr_t my,int width) \
 { \
     DECL_HV_FILTER(p) \
-    ff_h2656_put_ ## b ## _ ## depth ## _##opt(dst, src, srcstride, height, hf, vf, width); \
+    ff_h2656_put_ ## b ## _ ## depth ## _##opt(dst, 2 * MAX_PB_SIZE, src, srcstride, height, hf, vf, width); \
 }
 
 #define FW_PUT_UNI(p, a, b, depth, opt) \
-- 
2.34.1