From mboxrd@z Thu Jan 1 00:00:00 1970
From: Diego Felix de Souza via ffmpeg-devel
Date: Mon, 7 Jul 2025 11:45:36 +0200
Subject: [FFmpeg-devel] [PATCH v2 3/3] avfilter/scale_cuda: Add support for 4:2:2 chroma subsampling
Message-ID: <20250707094536.22157-3-ddesouza@nvidia.com>
In-Reply-To: <20250707094536.22157-1-ddesouza@nvidia.com>
References: <20250707094536.22157-1-ddesouza@nvidia.com>
Cc: ddesouza@nvidia.com
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
=?us-ascii?Q?CN3B6yRidcVGo5dNCSeZ/oIWw7w+iKzmujqG0j8MXi1t6YqkSZSGiQvQbLr/?= =?us-ascii?Q?k89KtDQuJurtAPDWXY7Oa98QU6mviVB4RTmtkiNWBFPrdQFKHt6twdUgPhnW?= =?us-ascii?Q?c/rx5AZzQwsImNpPzYXDYey7R+NjIaCMHbrXL4jxcLLg2fDlFpxh3eXeo9b6?= =?us-ascii?Q?jK9K9TB8Gi3f58m9HRztO1+hYDJr0Biyg94P0LUt5pYIDhzqbEDODPxxU2an?= =?us-ascii?Q?MaWFhxSvqcFruRQ6BZnmu0WzCXKUDaRX505S0WDhe1wD/T+54jSOvEOc3HxM?= =?us-ascii?Q?5MHAj0ttlJBxsBGB1GLojRLI00T2iSqxkJk0wfKW8S9daN6vwQmhiiomE4gd?= =?us-ascii?Q?GkWuYjqcKB4jtZ+T9ek4kfbPhhAnEtkVypI4?= X-Forefront-Antispam-Report: CIP:216.228.117.161;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc6edge2.nvidia.com;CAT:NONE;SFS:(13230040)(376014)(1800799024)(36860700013)(82310400026);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 07 Jul 2025 09:47:09.9356 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: accbad8e-df2b-4610-1c44-08ddbd3b3e2d X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.117.161];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: MWH0EPF000A6733.namprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH1PPFC596BECF8 From: Diego de Souza The supported YUV pixel formats were separated between planar and semiplanar. This approach reduces the number of CUDA kernels for all pixel formats. This patch: 1. Adds support for YUV 4:2:2 planar and semi-planar formats: yuv422p, yuv422p10, nv16, p210, p216 2. 
Implements new conversion structures and kernel definitions for planar and semi-planar formats Signed-off-by: Diego de Souza --- libavfilter/vf_scale_cuda.c | 52 ++- libavfilter/vf_scale_cuda.cu | 637 ++++++++++++++++++----------------- 2 files changed, 368 insertions(+), 321 deletions(-) diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c index 44eef207ca..560f901730 100644 --- a/libavfilter/vf_scale_cuda.c +++ b/libavfilter/vf_scale_cuda.c @@ -39,17 +39,29 @@ #include "cuda/load_helper.h" #include "vf_scale_cuda.h" -static const enum AVPixelFormat supported_formats[] =3D { - AV_PIX_FMT_YUV420P, - AV_PIX_FMT_NV12, - AV_PIX_FMT_YUV444P, - AV_PIX_FMT_P010, - AV_PIX_FMT_P016, - AV_PIX_FMT_YUV444P16, - AV_PIX_FMT_0RGB32, - AV_PIX_FMT_0BGR32, - AV_PIX_FMT_RGB32, - AV_PIX_FMT_BGR32, +struct format_entry { + enum AVPixelFormat format; + const char *name; +}; + +static const struct format_entry supported_formats[] =3D { + {AV_PIX_FMT_YUV420P, "planar8"}, + {AV_PIX_FMT_YUV422P, "planar8"}, + {AV_PIX_FMT_YUV444P, "planar8"}, + {AV_PIX_FMT_YUV420P10,"planar10"}, + {AV_PIX_FMT_YUV422P10,"planar10"}, + {AV_PIX_FMT_YUV444P10,"planar10"}, + {AV_PIX_FMT_YUV444P16,"planar16"}, + {AV_PIX_FMT_NV12, "semiplanar8"}, + {AV_PIX_FMT_NV16, "semiplanar8"}, + {AV_PIX_FMT_P010, "semiplanar10"}, + {AV_PIX_FMT_P210, "semiplanar10"}, + {AV_PIX_FMT_P016, "semiplanar16"}, + {AV_PIX_FMT_P216, "semiplanar16"}, + {AV_PIX_FMT_0RGB32, "bgr0"}, + {AV_PIX_FMT_0BGR32, "rgb0"}, + {AV_PIX_FMT_RGB32, "bgra"}, + {AV_PIX_FMT_BGR32, "rgba"}, }; #define DIV_UP(a, b) ( ((a) + (b) - 1) / (b) ) @@ -187,11 +199,21 @@ static int format_is_supported(enum AVPixelFormat fmt= ) int i; for (i =3D 0; i < FF_ARRAY_ELEMS(supported_formats); i++) - if (supported_formats[i] =3D=3D fmt) + if (supported_formats[i].format =3D=3D fmt) return 1; return 0; } +static const char* get_format_name(enum AVPixelFormat fmt) +{ + int i; + + for (i =3D 0; i < FF_ARRAY_ELEMS(supported_formats); i++) + if 
(supported_formats[i].format =3D=3D fmt) + return supported_formats[i].name; + return NULL; +} + static av_cold void set_format_info(AVFilterContext *ctx, enum AVPixelForm= at in_format, enum AVPixelFormat out_format) { CUDAScaleContext *s =3D ctx->priv; @@ -284,8 +306,8 @@ static av_cold int cudascale_load_functions(AVFilterCon= text *ctx) char buf[128]; int ret; - const char *in_fmt_name =3D av_get_pix_fmt_name(s->in_fmt); - const char *out_fmt_name =3D av_get_pix_fmt_name(s->out_fmt); + const char *in_fmt_name =3D get_format_name(s->in_fmt); + const char *out_fmt_name =3D get_format_name(s->out_fmt); const char *function_infix =3D ""; @@ -335,11 +357,13 @@ static av_cold int cudascale_load_functions(AVFilterC= ontext *ctx) ret =3D AVERROR(ENOSYS); goto fail; } + av_log(ctx, AV_LOG_DEBUG, "Luma filter: %s (%s -> %s)\n", buf, av_get_= pix_fmt_name(s->in_fmt), av_get_pix_fmt_name(s->out_fmt)); snprintf(buf, sizeof(buf), "Subsample_%s_%s_%s_uv", function_infix, in= _fmt_name, out_fmt_name); ret =3D CHECK_CU(cu->cuModuleGetFunction(&s->cu_func_uv, s->cu_module,= buf)); if (ret < 0) goto fail; + av_log(ctx, AV_LOG_DEBUG, "Chroma filter: %s (%s -> %s)\n", buf, av_ge= t_pix_fmt_name(s->in_fmt), av_get_pix_fmt_name(s->out_fmt)); fail: CHECK_CU(cu->cuCtxPopCurrent(&dummy)); diff --git a/libavfilter/vf_scale_cuda.cu b/libavfilter/vf_scale_cuda.cu index 271b55cd5d..62d1649a25 100644 --- a/libavfilter/vf_scale_cuda.cu +++ b/libavfilter/vf_scale_cuda.cu @@ -40,6 +40,11 @@ static inline __device__ ushort conv_8to16(uchar in, ush= ort mask) return ((ushort)in | ((ushort)in << 8)) & mask; } +static inline __device__ ushort conv_8to10pl(uchar in) +{ + return ((ushort)in << 2) | ((ushort)in >> 6); +} + static inline __device__ uchar conv_16to8(ushort in) { return in >> 8; @@ -50,16 +55,31 @@ static inline __device__ uchar conv_10to8(ushort in) return in >> 8; } +static inline __device__ uchar conv_10to8pl(ushort in) +{ + return in >> 2; +} + static inline __device__ ushort 
conv_10to16(ushort in) { return in | (in >> 10); } +static inline __device__ ushort conv_10to16pl(ushort in) +{ + return (in << 6) | (in >> 4); +} + static inline __device__ ushort conv_16to10(ushort in) { return in & mask_10bit; } +static inline __device__ ushort conv_16to10pl(ushort in) +{ + return in >> 6; +} + #define DEF_F(N, T) \ template subsample_func_y, = \ subsample_function_t subsample_func_uv> = \ @@ -81,9 +101,9 @@ static inline __device__ ushort conv_16to10(ushort in) #define DEFAULT_DST(n) \ dst[n][yo*FIXED_PITCH+xo] -// yuv420p->X +// planar8->X -struct Convert_yuv420p_yuv420p +struct Convert_planar8_planar8 { static const int in_bit_depth =3D 8; typedef uchar in_T; @@ -103,71 +123,69 @@ struct Convert_yuv420p_yuv420p } }; -struct Convert_yuv420p_nv12 +struct Convert_planar8_planar10 { static const int in_bit_depth =3D 8; typedef uchar in_T; typedef uchar in_T_uv; - typedef uchar out_T; - typedef uchar2 out_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_8to10pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D make_uchar2( - SUB_F(uv, 1), - SUB_F(uv, 2) - ); + DEFAULT_DST(1) =3D conv_8to10pl(SUB_F(uv, 1)); + DEFAULT_DST(2) =3D conv_8to10pl(SUB_F(uv, 2)); } }; -struct Convert_yuv420p_yuv444p +struct Convert_planar8_planar16 { static const int in_bit_depth =3D 8; typedef uchar in_T; typedef uchar in_T_uv; - typedef uchar out_T; - typedef uchar out_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D SUB_F(uv, 1); - DEFAULT_DST(2) =3D SUB_F(uv, 2); + DEFAULT_DST(1) =3D conv_8to16(SUB_F(uv, 1), mask_16bit); + DEFAULT_DST(2) =3D conv_8to16(SUB_F(uv, 2), mask_16bit); } }; -struct Convert_yuv420p_p010le +struct Convert_planar8_semiplanar8 { static const int in_bit_depth =3D 8; 
typedef uchar in_T; typedef uchar in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_10bit); + DEFAULT_DST(0) =3D SUB_F(y, 0); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(SUB_F(uv, 1), mask_10bit), - conv_8to16(SUB_F(uv, 2), mask_10bit) + DEFAULT_DST(1) =3D make_uchar2( + SUB_F(uv, 1), + SUB_F(uv, 2) ); } }; -struct Convert_yuv420p_p016le +struct Convert_planar8_semiplanar10 { static const int in_bit_depth =3D 8; typedef uchar in_T; @@ -177,25 +195,25 @@ struct Convert_yuv420p_p016le DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); + DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_10bit); } DEF_F(Convert_uv, out_T_uv) { DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(SUB_F(uv, 1), mask_16bit), - conv_8to16(SUB_F(uv, 2), mask_16bit) + conv_8to16(SUB_F(uv, 1), mask_10bit), + conv_8to16(SUB_F(uv, 2), mask_10bit) ); } }; -struct Convert_yuv420p_yuv444p16le +struct Convert_planar8_semiplanar16 { static const int in_bit_depth =3D 8; typedef uchar in_T; typedef uchar in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { @@ -204,41 +222,44 @@ struct Convert_yuv420p_yuv444p16le DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D conv_8to16(SUB_F(uv, 1), mask_16bit); - DEFAULT_DST(2) =3D conv_8to16(SUB_F(uv, 2), mask_16bit); + DEFAULT_DST(1) =3D make_ushort2( + conv_8to16(SUB_F(uv, 1), mask_16bit), + conv_8to16(SUB_F(uv, 2), mask_16bit) + ); } }; -// nv12->X -struct Convert_nv12_yuv420p + +// planar10->X + +struct Convert_planar10_planar8 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; typedef uchar out_T; typedef uchar out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + 
DEFAULT_DST(0) =3D conv_10to8pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D res.x; - DEFAULT_DST(2) =3D res.y; + DEFAULT_DST(1) =3D conv_10to8pl(SUB_F(uv, 1)); + DEFAULT_DST(2) =3D conv_10to8pl(SUB_F(uv, 2)); } }; -struct Convert_nv12_nv12 +struct Convert_planar10_planar10 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; - typedef uchar out_T; - typedef uchar2 out_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { @@ -248,148 +269,145 @@ struct Convert_nv12_nv12 DEF_F(Convert_uv, out_T_uv) { DEFAULT_DST(1) =3D SUB_F(uv, 1); + DEFAULT_DST(2) =3D SUB_F(uv, 2); } }; -struct Convert_nv12_yuv444p +struct Convert_planar10_planar16 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; - typedef uchar out_T; - typedef uchar out_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_10to16pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D res.x; - DEFAULT_DST(2) =3D res.y; + DEFAULT_DST(1) =3D conv_10to16pl(SUB_F(uv, 1)); + DEFAULT_DST(2) =3D conv_10to16pl(SUB_F(uv, 2)); } }; -struct Convert_nv12_p010le +struct Convert_planar10_semiplanar8 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_10bit); + DEFAULT_DST(0) =3D conv_10to8pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - in_T_uv res =3D SUB_F(uv, 1); - 
DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(res.x, mask_10bit), - conv_8to16(res.y, mask_10bit) + DEFAULT_DST(1) =3D make_uchar2( + conv_10to8pl(SUB_F(uv, 1)), + conv_10to8pl(SUB_F(uv, 2)) ); } }; -struct Convert_nv12_p016le +struct Convert_planar10_semiplanar10 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; typedef ushort out_T; typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); + DEFAULT_DST(0) =3D (SUB_F(y, 0) << 6); } DEF_F(Convert_uv, out_T_uv) { - in_T_uv res =3D SUB_F(uv, 1); DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(res.x, mask_16bit), - conv_8to16(res.y, mask_16bit) + (SUB_F(uv, 1) << 6), + (SUB_F(uv, 2) << 6) ); } }; -struct Convert_nv12_yuv444p16le +struct Convert_planar10_semiplanar16 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar2 in_T_uv; + static const int in_bit_depth =3D 10; + typedef ushort in_T; + typedef ushort in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); + DEFAULT_DST(0) =3D conv_10to16pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_8to16(res.x, mask_16bit); - DEFAULT_DST(2) =3D conv_8to16(res.y, mask_16bit); + DEFAULT_DST(1) =3D make_ushort2( + conv_10to16pl(SUB_F(uv, 1)), + conv_10to16pl(SUB_F(uv, 2)) + ); } }; -// yuv444p->X +// planar16->X -struct Convert_yuv444p_yuv420p +struct Convert_planar16_planar8 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; typedef uchar out_T; typedef uchar out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); } 
DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D SUB_F(uv, 1); - DEFAULT_DST(2) =3D SUB_F(uv, 2); + DEFAULT_DST(1) =3D conv_16to8(SUB_F(uv, 1)); + DEFAULT_DST(2) =3D conv_16to8(SUB_F(uv, 2)); } }; -struct Convert_yuv444p_nv12 +struct Convert_planar16_planar10 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; - typedef uchar out_T; - typedef uchar2 out_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_16to10pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D make_uchar2( - SUB_F(uv, 1), - SUB_F(uv, 2) - ); + DEFAULT_DST(1) =3D conv_16to10pl(SUB_F(uv, 1)); + DEFAULT_DST(2) =3D conv_16to10pl(SUB_F(uv, 2)); } }; -struct Convert_yuv444p_yuv444p +struct Convert_planar16_planar16 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; - typedef uchar out_T; - typedef uchar out_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { @@ -403,144 +421,144 @@ struct Convert_yuv444p_yuv444p } }; -struct Convert_yuv444p_p010le +struct Convert_planar16_semiplanar8 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_10bit); + DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(SUB_F(uv, 1), mask_10bit), - conv_8to16(SUB_F(uv, 2), mask_10bit) + DEFAULT_DST(1) =3D make_uchar2( + conv_16to8(SUB_F(uv, 1)), + conv_16to8(SUB_F(uv, 2)) ); } }; 
-struct Convert_yuv444p_p016le +struct Convert_planar16_semiplanar10 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; typedef ushort out_T; typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); + DEFAULT_DST(0) =3D conv_16to10(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { DEFAULT_DST(1) =3D make_ushort2( - conv_8to16(SUB_F(uv, 1), mask_16bit), - conv_8to16(SUB_F(uv, 2), mask_16bit) + conv_16to10(SUB_F(uv, 1)), + conv_16to10(SUB_F(uv, 2)) ); } }; -struct Convert_yuv444p_yuv444p16le +struct Convert_planar16_semiplanar16 { - static const int in_bit_depth =3D 8; - typedef uchar in_T; - typedef uchar in_T_uv; + static const int in_bit_depth =3D 16; + typedef ushort in_T; + typedef ushort in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); + DEFAULT_DST(0) =3D SUB_F(y, 0); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D conv_8to16(SUB_F(uv, 1), mask_16bit); - DEFAULT_DST(2) =3D conv_8to16(SUB_F(uv, 2), mask_16bit); + DEFAULT_DST(1) =3D make_ushort2( + SUB_F(uv, 1), + SUB_F(uv, 2) + ); } }; -// p010le->X +// semiplanar8->X -struct Convert_p010le_yuv420p +struct Convert_semiplanar8_planar8 { - static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; + static const int in_bit_depth =3D 8; + typedef uchar in_T; + typedef uchar2 in_T_uv; typedef uchar out_T; typedef uchar out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_10to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D SUB_F(y, 0); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_10to8(res.x); - DEFAULT_DST(2) =3D conv_10to8(res.y); + DEFAULT_DST(1) =3D res.x; + DEFAULT_DST(2) =3D res.y; } }; -struct Convert_p010le_nv12 +struct Convert_semiplanar8_planar10 { 
- static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; - typedef uchar out_T; - typedef uchar2 out_T_uv; + static const int in_bit_depth =3D 8; + typedef uchar in_T; + typedef uchar2 in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_10to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_8to10pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D make_uchar2( - conv_10to8(res.x), - conv_10to8(res.y) - ); + DEFAULT_DST(1) =3D conv_8to10pl(res.x); + DEFAULT_DST(2) =3D conv_8to10pl(res.y); } }; -struct Convert_p010le_yuv444p +struct Convert_semiplanar8_planar16 { - static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; - typedef uchar out_T; - typedef uchar out_T_uv; + static const int in_bit_depth =3D 8; + typedef uchar in_T; + typedef uchar2 in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_10to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_10to8(res.x); - DEFAULT_DST(2) =3D conv_10to8(res.y); + DEFAULT_DST(1) =3D conv_8to16(res.x, mask_16bit); + DEFAULT_DST(2) =3D conv_8to16(res.y, mask_16bit); } }; -struct Convert_p010le_p010le +struct Convert_semiplanar8_semiplanar8 { - static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + static const int in_bit_depth =3D 8; + typedef uchar in_T; + typedef uchar2 in_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { @@ -553,55 +571,57 @@ struct Convert_p010le_p010le } }; -struct Convert_p010le_p016le +struct Convert_semiplanar8_semiplanar10 { - static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; + static const int in_bit_depth =3D 8; + typedef 
uchar in_T; + typedef uchar2 in_T_uv; typedef ushort out_T; typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_10to16(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_10bit); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); DEFAULT_DST(1) =3D make_ushort2( - conv_10to16(res.x), - conv_10to16(res.y) + conv_8to16(res.x, mask_10bit), + conv_8to16(res.y, mask_10bit) ); } }; -struct Convert_p010le_yuv444p16le +struct Convert_semiplanar8_semiplanar16 { - static const int in_bit_depth =3D 10; - typedef ushort in_T; - typedef ushort2 in_T_uv; + static const int in_bit_depth =3D 8; + typedef uchar in_T; + typedef uchar2 in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_10to16(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_8to16(SUB_F(y, 0), mask_16bit); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_10to16(res.x); - DEFAULT_DST(2) =3D conv_10to16(res.y); + DEFAULT_DST(1) =3D make_ushort2( + conv_8to16(res.x, mask_16bit), + conv_8to16(res.y, mask_16bit) + ); } }; -// p016le->X +// semiplanar10->X -struct Convert_p016le_yuv420p +struct Convert_semiplanar10_planar8 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; typedef uchar out_T; @@ -609,87 +629,85 @@ struct Convert_p016le_yuv420p DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_10to8(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_16to8(res.x); - DEFAULT_DST(2) =3D conv_16to8(res.y); + DEFAULT_DST(1) =3D conv_10to8(res.x); + DEFAULT_DST(2) =3D conv_10to8(res.y); } }; -struct Convert_p016le_nv12 +struct Convert_semiplanar10_planar10 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; - typedef uchar 
out_T; - typedef uchar2 out_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D SUB_F(y, 0) >> 6; } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D make_uchar2( - conv_16to8(res.x), - conv_16to8(res.y) - ); + DEFAULT_DST(1) =3D res.x >> 6; + DEFAULT_DST(2) =3D res.y >> 6; } }; -struct Convert_p016le_yuv444p +struct Convert_semiplanar10_planar16 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; - typedef uchar out_T; - typedef uchar out_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_10to16(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D conv_16to8(res.x); - DEFAULT_DST(2) =3D conv_16to8(res.y); + DEFAULT_DST(1) =3D conv_10to16(res.x); + DEFAULT_DST(2) =3D conv_10to16(res.y); } }; -struct Convert_p016le_p010le +struct Convert_semiplanar10_semiplanar8 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_16to10(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_10to8(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D make_ushort2( - conv_16to10(res.x), - conv_16to10(res.y) + DEFAULT_DST(1) =3D make_uchar2( + conv_10to8(res.x), + conv_10to8(res.y) ); } }; -struct Convert_p016le_p016le +struct Convert_semiplanar10_semiplanar10 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; typedef ushort out_T; @@ -706,34 +724,37 @@ struct Convert_p016le_p016le } }; -struct 
Convert_p016le_yuv444p16le +struct Convert_semiplanar10_semiplanar16 { - static const int in_bit_depth =3D 16; + static const int in_bit_depth =3D 10; typedef ushort in_T; typedef ushort2 in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D SUB_F(y, 0); + DEFAULT_DST(0) =3D conv_10to16(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { in_T_uv res =3D SUB_F(uv, 1); - DEFAULT_DST(1) =3D res.x; - DEFAULT_DST(2) =3D res.y; + DEFAULT_DST(1) =3D make_ushort2( + conv_10to16(res.x), + conv_10to16(res.y) + ); } }; -// yuv444p16le->X -struct Convert_yuv444p16le_yuv420p +// semiplanar16->X + +struct Convert_semiplanar16_planar8 { static const int in_bit_depth =3D 16; typedef ushort in_T; - typedef ushort in_T_uv; + typedef ushort2 in_T_uv; typedef uchar out_T; typedef uchar out_T_uv; @@ -744,104 +765,107 @@ struct Convert_yuv444p16le_yuv420p DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D conv_16to8(SUB_F(uv, 1)); - DEFAULT_DST(2) =3D conv_16to8(SUB_F(uv, 2)); + in_T_uv res =3D SUB_F(uv, 1); + DEFAULT_DST(1) =3D conv_16to8(res.x); + DEFAULT_DST(2) =3D conv_16to8(res.y); } }; -struct Convert_yuv444p16le_nv12 +struct Convert_semiplanar16_planar10 { static const int in_bit_depth =3D 16; typedef ushort in_T; - typedef ushort in_T_uv; - typedef uchar out_T; - typedef uchar2 out_T_uv; + typedef ushort2 in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) =3D conv_16to8(SUB_F(y, 0)); + DEFAULT_DST(0) =3D conv_16to10pl(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) =3D make_uchar2( - conv_16to8(SUB_F(uv, 1)), - conv_16to8(SUB_F(uv, 2)) - ); + in_T_uv res =3D SUB_F(uv, 1); + DEFAULT_DST(1) =3D conv_16to10pl(res.x); + DEFAULT_DST(2) =3D conv_16to10pl(res.y); } }; -struct Convert_yuv444p16le_yuv444p +struct Convert_semiplanar16_planar16 { static const int in_bit_depth =3D 16; typedef ushort in_T; - typedef ushort in_T_uv; - typedef uchar out_T; - 
typedef uchar out_T_uv; + typedef ushort2 in_T_uv; + typedef ushort out_T; + typedef ushort out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0)); + DEFAULT_DST(0) = SUB_F(y, 0); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) = conv_16to8(SUB_F(uv, 1)); - DEFAULT_DST(2) = conv_16to8(SUB_F(uv, 2)); + in_T_uv res = SUB_F(uv, 1); + DEFAULT_DST(1) = res.x; + DEFAULT_DST(2) = res.y; } }; -struct Convert_yuv444p16le_p010le +struct Convert_semiplanar16_semiplanar8 { static const int in_bit_depth = 16; typedef ushort in_T; - typedef ushort in_T_uv; - typedef ushort out_T; - typedef ushort2 out_T_uv; + typedef ushort2 in_T_uv; + typedef uchar out_T; + typedef uchar2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0)); + DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { - DEFAULT_DST(1) = make_ushort2( - conv_16to10(SUB_F(uv, 1)), - conv_16to10(SUB_F(uv, 2)) + in_T_uv res = SUB_F(uv, 1); + DEFAULT_DST(1) = make_uchar2( + conv_16to8(res.x), + conv_16to8(res.y) ); } }; -struct Convert_yuv444p16le_p016le +struct Convert_semiplanar16_semiplanar10 { static const int in_bit_depth = 16; typedef ushort in_T; - typedef ushort in_T_uv; + typedef ushort2 in_T_uv; typedef ushort out_T; typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { - DEFAULT_DST(0) = SUB_F(y, 0); + DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0)); } DEF_F(Convert_uv, out_T_uv) { + in_T_uv res = SUB_F(uv, 1); DEFAULT_DST(1) = make_ushort2( - SUB_F(uv, 1), - SUB_F(uv, 2) + conv_16to10(res.x), + conv_16to10(res.y) ); } }; -struct Convert_yuv444p16le_yuv444p16le +struct Convert_semiplanar16_semiplanar16 { static const int in_bit_depth = 16; typedef ushort in_T; - typedef ushort in_T_uv; + typedef ushort2 in_T_uv; typedef ushort out_T; - typedef ushort out_T_uv; + typedef ushort2 out_T_uv; DEF_F(Convert, out_T) { @@ -851,7 +875,6 @@ struct Convert_yuv444p16le_yuv444p16le DEF_F(Convert_uv, out_T_uv) { DEFAULT_DST(1)
= SUB_F(uv, 1); - DEFAULT_DST(2) = SUB_F(uv, 2); } }; @@ -1184,12 +1207,12 @@ extern "C" { NEAREST_KERNEL(C,_uv) #define NEAREST_KERNELS(C) \ - NEAREST_KERNEL_RAW(yuv420p_ ## C) \ - NEAREST_KERNEL_RAW(nv12_ ## C) \ - NEAREST_KERNEL_RAW(yuv444p_ ## C) \ - NEAREST_KERNEL_RAW(p010le_ ## C) \ - NEAREST_KERNEL_RAW(p016le_ ## C) \ - NEAREST_KERNEL_RAW(yuv444p16le_ ## C) + NEAREST_KERNEL_RAW(planar8_ ## C) \ + NEAREST_KERNEL_RAW(planar10_ ## C) \ + NEAREST_KERNEL_RAW(planar16_ ## C) \ + NEAREST_KERNEL_RAW(semiplanar8_ ## C) \ + NEAREST_KERNEL_RAW(semiplanar10_ ## C) \ + NEAREST_KERNEL_RAW(semiplanar16_ ## C) #define NEAREST_KERNELS_RGB(C) \ NEAREST_KERNEL_RAW(rgb0_ ## C) \ @@ -1197,12 +1220,12 @@ extern "C" { NEAREST_KERNEL_RAW(rgba_ ## C) \ NEAREST_KERNEL_RAW(bgra_ ## C) \ -NEAREST_KERNELS(yuv420p) -NEAREST_KERNELS(nv12) -NEAREST_KERNELS(yuv444p) -NEAREST_KERNELS(p010le) -NEAREST_KERNELS(p016le) -NEAREST_KERNELS(yuv444p16le) +NEAREST_KERNELS(planar8) +NEAREST_KERNELS(planar10) +NEAREST_KERNELS(planar16) +NEAREST_KERNELS(semiplanar8) +NEAREST_KERNELS(semiplanar10) +NEAREST_KERNELS(semiplanar16) NEAREST_KERNELS_RGB(rgb0) NEAREST_KERNELS_RGB(bgr0) @@ -1224,12 +1247,12 @@ NEAREST_KERNELS_RGB(bgra) BILINEAR_KERNEL(C,_uv) #define BILINEAR_KERNELS(C) \ - BILINEAR_KERNEL_RAW(yuv420p_ ## C) \ - BILINEAR_KERNEL_RAW(nv12_ ## C) \ - BILINEAR_KERNEL_RAW(yuv444p_ ## C) \ - BILINEAR_KERNEL_RAW(p010le_ ## C) \ - BILINEAR_KERNEL_RAW(p016le_ ## C) \ - BILINEAR_KERNEL_RAW(yuv444p16le_ ## C) + BILINEAR_KERNEL_RAW(planar8_ ## C) \ + BILINEAR_KERNEL_RAW(planar10_ ## C) \ + BILINEAR_KERNEL_RAW(planar16_ ## C) \ + BILINEAR_KERNEL_RAW(semiplanar8_ ## C) \ + BILINEAR_KERNEL_RAW(semiplanar10_ ## C) \ + BILINEAR_KERNEL_RAW(semiplanar16_ ## C) #define BILINEAR_KERNELS_RGB(C) \ BILINEAR_KERNEL_RAW(rgb0_ ## C) \ @@ -1237,12 +1260,12 @@ NEAREST_KERNELS_RGB(bgra) BILINEAR_KERNEL_RAW(rgba_ ## C) \ BILINEAR_KERNEL_RAW(bgra_ ## C) -BILINEAR_KERNELS(yuv420p) -BILINEAR_KERNELS(nv12)
-BILINEAR_KERNELS(yuv444p) -BILINEAR_KERNELS(p010le) -BILINEAR_KERNELS(p016le) -BILINEAR_KERNELS(yuv444p16le) +BILINEAR_KERNELS(planar8) +BILINEAR_KERNELS(planar10) +BILINEAR_KERNELS(planar16) +BILINEAR_KERNELS(semiplanar8) +BILINEAR_KERNELS(semiplanar10) +BILINEAR_KERNELS(semiplanar16) BILINEAR_KERNELS_RGB(rgb0) BILINEAR_KERNELS_RGB(bgr0) @@ -1264,12 +1287,12 @@ BILINEAR_KERNELS_RGB(bgra) BICUBIC_KERNEL(C,_uv) #define BICUBIC_KERNELS(C) \ - BICUBIC_KERNEL_RAW(yuv420p_ ## C) \ - BICUBIC_KERNEL_RAW(nv12_ ## C) \ - BICUBIC_KERNEL_RAW(yuv444p_ ## C) \ - BICUBIC_KERNEL_RAW(p010le_ ## C) \ - BICUBIC_KERNEL_RAW(p016le_ ## C) \ - BICUBIC_KERNEL_RAW(yuv444p16le_ ## C) + BICUBIC_KERNEL_RAW(planar8_ ## C) \ + BICUBIC_KERNEL_RAW(planar10_ ## C) \ + BICUBIC_KERNEL_RAW(planar16_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar8_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar10_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar16_ ## C) #define BICUBIC_KERNELS_RGB(C) \ BICUBIC_KERNEL_RAW(rgb0_ ## C) \ @@ -1277,12 +1300,12 @@ BILINEAR_KERNELS_RGB(bgra) BICUBIC_KERNEL_RAW(rgba_ ## C) \ BICUBIC_KERNEL_RAW(bgra_ ## C) -BICUBIC_KERNELS(yuv420p) -BICUBIC_KERNELS(nv12) -BICUBIC_KERNELS(yuv444p) -BICUBIC_KERNELS(p010le) -BICUBIC_KERNELS(p016le) -BICUBIC_KERNELS(yuv444p16le) +BICUBIC_KERNELS(planar8) +BICUBIC_KERNELS(planar10) +BICUBIC_KERNELS(planar16) +BICUBIC_KERNELS(semiplanar8) +BICUBIC_KERNELS(semiplanar10) +BICUBIC_KERNELS(semiplanar16) BICUBIC_KERNELS_RGB(rgb0) BICUBIC_KERNELS_RGB(bgr0) @@ -1304,12 +1327,12 @@ BICUBIC_KERNELS_RGB(bgra) LANCZOS_KERNEL(C,_uv) #define LANCZOS_KERNELS(C) \ - LANCZOS_KERNEL_RAW(yuv420p_ ## C) \ - LANCZOS_KERNEL_RAW(nv12_ ## C) \ - LANCZOS_KERNEL_RAW(yuv444p_ ## C) \ - LANCZOS_KERNEL_RAW(p010le_ ## C) \ - LANCZOS_KERNEL_RAW(p016le_ ## C) \ - LANCZOS_KERNEL_RAW(yuv444p16le_ ## C) + LANCZOS_KERNEL_RAW(planar8_ ## C) \ + LANCZOS_KERNEL_RAW(planar10_ ## C) \ + LANCZOS_KERNEL_RAW(planar16_ ## C) \ + LANCZOS_KERNEL_RAW(semiplanar8_ ## C) \ + LANCZOS_KERNEL_RAW(semiplanar10_ ## C) \ + 
LANCZOS_KERNEL_RAW(semiplanar16_ ## C) #define LANCZOS_KERNELS_RGB(C) \ LANCZOS_KERNEL_RAW(rgb0_ ## C) \ @@ -1317,12 +1340,12 @@ BICUBIC_KERNELS_RGB(bgra) LANCZOS_KERNEL_RAW(rgba_ ## C) \ LANCZOS_KERNEL_RAW(bgra_ ## C) -LANCZOS_KERNELS(yuv420p) -LANCZOS_KERNELS(nv12) -LANCZOS_KERNELS(yuv444p) -LANCZOS_KERNELS(p010le) -LANCZOS_KERNELS(p016le) -LANCZOS_KERNELS(yuv444p16le) +LANCZOS_KERNELS(planar8) +LANCZOS_KERNELS(planar10) +LANCZOS_KERNELS(planar16) +LANCZOS_KERNELS(semiplanar8) +LANCZOS_KERNELS(semiplanar10) +LANCZOS_KERNELS(semiplanar16) LANCZOS_KERNELS_RGB(rgb0) LANCZOS_KERNELS_RGB(bgr0) -- 2.39.5 (Apple Git-154) --===============6034346265839831920==--