From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 3 Jul 2025 16:25:20 +0200
In-Reply-To: <20250703142520.16586-1-ddesouza@nvidia.com>
References: <20250703142520.16586-1-ddesouza@nvidia.com>
MIME-Version: 1.0
List-Id: FFmpeg development discussions and patches
From: Diego Felix de Souza via ffmpeg-devel
Precedence: list
Cc: ddesouza@nvidia.com
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/3] avfilter/scale_cuda: Add support for 4:2:2 chroma subsampling
Content-Type: multipart/mixed; boundary="===============6439331732457473877=="
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

--===============6439331732457473877==
Content-Type: message/rfc822
Content-Disposition: inline

X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
CC: Diego de Souza
Subject: [PATCH 3/3] avfilter/scale_cuda: Add support for 4:2:2 chroma subsampling
Date: Thu, 3 Jul 2025 16:25:20 +0200
Message-ID: <20250703142520.16586-3-ddesouza@nvidia.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250703142520.16586-1-ddesouza@nvidia.com>
References: <20250703142520.16586-1-ddesouza@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain

From: Diego de Souza

The supported YUV pixel formats are now split into planar and
semi-planar groups. This reduces the number of CUDA kernels needed to
cover all pixel formats.

This patch:
1. Adds support for YUV 4:2:2 planar and semi-planar formats:
   yuv422p, yuv422p10, nv16, p210, p216
2.
   Implements new conversion structures and kernel definitions for
   planar and semi-planar formats

Signed-off-by: Diego de Souza
---
 libavfilter/vf_scale_cuda.c  |  52 ++-
 libavfilter/vf_scale_cuda.cu | 637 ++++++++++++++++++-----------------
 2 files changed, 368 insertions(+), 321 deletions(-)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index 44eef207ca..560f901730 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -39,17 +39,29 @@
 #include "cuda/load_helper.h"
 #include "vf_scale_cuda.h"

-static const enum AVPixelFormat supported_formats[] = {
-    AV_PIX_FMT_YUV420P,
-    AV_PIX_FMT_NV12,
-    AV_PIX_FMT_YUV444P,
-    AV_PIX_FMT_P010,
-    AV_PIX_FMT_P016,
-    AV_PIX_FMT_YUV444P16,
-    AV_PIX_FMT_0RGB32,
-    AV_PIX_FMT_0BGR32,
-    AV_PIX_FMT_RGB32,
-    AV_PIX_FMT_BGR32,
+struct format_entry {
+    enum AVPixelFormat format;
+    const char *name;
+};
+
+static const struct format_entry supported_formats[] = {
+    {AV_PIX_FMT_YUV420P, "planar8"},
+    {AV_PIX_FMT_YUV422P, "planar8"},
+    {AV_PIX_FMT_YUV444P, "planar8"},
+    {AV_PIX_FMT_YUV420P10,"planar10"},
+    {AV_PIX_FMT_YUV422P10,"planar10"},
+    {AV_PIX_FMT_YUV444P10,"planar10"},
+    {AV_PIX_FMT_YUV444P16,"planar16"},
+    {AV_PIX_FMT_NV12, "semiplanar8"},
+    {AV_PIX_FMT_NV16, "semiplanar8"},
+    {AV_PIX_FMT_P010, "semiplanar10"},
+    {AV_PIX_FMT_P210, "semiplanar10"},
+    {AV_PIX_FMT_P016, "semiplanar16"},
+    {AV_PIX_FMT_P216, "semiplanar16"},
+    {AV_PIX_FMT_0RGB32, "bgr0"},
+    {AV_PIX_FMT_0BGR32, "rgb0"},
+    {AV_PIX_FMT_RGB32, "bgra"},
+    {AV_PIX_FMT_BGR32, "rgba"},
 };

 #define DIV_UP(a, b) ( ((a) + (b) - 1) / (b) )
@@ -187,11 +199,21 @@ static int format_is_supported(enum AVPixelFormat fmt)
     int i;

     for (i = 0; i < FF_ARRAY_ELEMS(supported_formats); i++)
-        if (supported_formats[i] == fmt)
+        if (supported_formats[i].format == fmt)
             return 1;
     return 0;
 }

+static const char* get_format_name(enum AVPixelFormat fmt)
+{
+    int i;
+
+    for (i = 0; i < FF_ARRAY_ELEMS(supported_formats); i++)
+        if (supported_formats[i].format == fmt)
+            return supported_formats[i].name;
+    return NULL;
+}
+
 static av_cold void set_format_info(AVFilterContext *ctx, enum AVPixelFormat in_format, enum AVPixelFormat out_format)
 {
     CUDAScaleContext *s = ctx->priv;
@@ -284,8 +306,8 @@ static av_cold int cudascale_load_functions(AVFilterContext *ctx)
     char buf[128];
     int ret;

-    const char *in_fmt_name = av_get_pix_fmt_name(s->in_fmt);
-    const char *out_fmt_name = av_get_pix_fmt_name(s->out_fmt);
+    const char *in_fmt_name = get_format_name(s->in_fmt);
+    const char *out_fmt_name = get_format_name(s->out_fmt);

     const char *function_infix = "";

@@ -335,11 +357,13 @@ static av_cold int cudascale_load_functions(AVFilterContext *ctx)
         ret = AVERROR(ENOSYS);
         goto fail;
     }
+    av_log(ctx, AV_LOG_DEBUG, "Luma filter: %s (%s -> %s)\n", buf, av_get_pix_fmt_name(s->in_fmt), av_get_pix_fmt_name(s->out_fmt));

     snprintf(buf, sizeof(buf), "Subsample_%s_%s_%s_uv", function_infix, in_fmt_name, out_fmt_name);
     ret = CHECK_CU(cu->cuModuleGetFunction(&s->cu_func_uv, s->cu_module, buf));
     if (ret < 0)
         goto fail;
+    av_log(ctx, AV_LOG_DEBUG, "Chroma filter: %s (%s -> %s)\n", buf, av_get_pix_fmt_name(s->in_fmt), av_get_pix_fmt_name(s->out_fmt));

 fail:
     CHECK_CU(cu->cuCtxPopCurrent(&dummy));
diff --git a/libavfilter/vf_scale_cuda.cu b/libavfilter/vf_scale_cuda.cu
index 271b55cd5d..62d1649a25 100644
--- a/libavfilter/vf_scale_cuda.cu
+++ b/libavfilter/vf_scale_cuda.cu
@@ -40,6 +40,11 @@ static inline __device__ ushort conv_8to16(uchar in, ushort mask)
     return ((ushort)in | ((ushort)in << 8)) & mask;
 }

+static inline __device__ ushort conv_8to10pl(uchar in)
+{
+    return ((ushort)in << 2) | ((ushort)in >> 6);
+}
+
 static inline __device__ uchar conv_16to8(ushort in)
 {
     return in >> 8;
@@ -50,16 +55,31 @@ static inline __device__ uchar conv_10to8(ushort in)
     return in >> 8;
 }

+static inline __device__ uchar conv_10to8pl(ushort in)
+{
+    return in >> 2;
+}
+
 static inline __device__ ushort conv_10to16(ushort in)
 {
     return in | (in >> 10);
 }

+static inline __device__ ushort conv_10to16pl(ushort in)
+{
+    return (in << 6) | (in >> 4);
+}
+
 static inline __device__ ushort conv_16to10(ushort in)
 {
     return in & mask_10bit;
 }

+static inline __device__ ushort conv_16to10pl(ushort in)
+{
+    return in >> 6;
+}
+
 #define DEF_F(N, T) \
     template<subsample_function_t<in_T> subsample_func_y, \
              subsample_function_t<in_T_uv> subsample_func_uv> \
@@ -81,9 +101,9 @@ static inline __device__ ushort conv_16to10(ushort in)
 #define DEFAULT_DST(n) \
     dst[n][yo*FIXED_PITCH+xo]

-// yuv420p->X
+// planar8->X

-struct Convert_yuv420p_yuv420p
+struct Convert_planar8_planar8
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
@@ -103,71 +123,69 @@ struct Convert_yuv420p_yuv420p
     }
 };

-struct Convert_yuv420p_nv12
+struct Convert_planar8_planar10
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
     typedef uchar in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_8to10pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_uchar2(
-            SUB_F(uv, 1),
-            SUB_F(uv, 2)
-        );
+        DEFAULT_DST(1) = conv_8to10pl(SUB_F(uv, 1));
+        DEFAULT_DST(2) = conv_8to10pl(SUB_F(uv, 2));
     }
 };

-struct Convert_yuv420p_yuv444p
+struct Convert_planar8_planar16
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
     typedef uchar in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = SUB_F(uv, 1);
-        DEFAULT_DST(2) = SUB_F(uv, 2);
+        DEFAULT_DST(1) = conv_8to16(SUB_F(uv, 1), mask_16bit);
+        DEFAULT_DST(2) = conv_8to16(SUB_F(uv, 2), mask_16bit);
     }
 };

-struct Convert_yuv420p_p010le
+struct Convert_planar8_semiplanar8
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
     typedef uchar in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_10bit);
+        DEFAULT_DST(0) = SUB_F(y, 0);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(SUB_F(uv, 1), mask_10bit),
-            conv_8to16(SUB_F(uv, 2), mask_10bit)
+        DEFAULT_DST(1) = make_uchar2(
+            SUB_F(uv, 1),
+            SUB_F(uv, 2)
         );
     }
 };

-struct Convert_yuv420p_p016le
+struct Convert_planar8_semiplanar10
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
@@ -177,25 +195,25 @@ struct Convert_yuv420p_p016le

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
+        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_10bit);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(SUB_F(uv, 1), mask_16bit),
-            conv_8to16(SUB_F(uv, 2), mask_16bit)
+            conv_8to16(SUB_F(uv, 1), mask_10bit),
+            conv_8to16(SUB_F(uv, 2), mask_10bit)
         );
     }
 };

-struct Convert_yuv420p_yuv444p16le
+struct Convert_planar8_semiplanar16
 {
     static const int in_bit_depth = 8;
     typedef uchar in_T;
     typedef uchar in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
@@ -204,41 +222,44 @@ struct Convert_yuv420p_yuv444p16le

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = conv_8to16(SUB_F(uv, 1), mask_16bit);
-        DEFAULT_DST(2) = conv_8to16(SUB_F(uv, 2), mask_16bit);
+        DEFAULT_DST(1) = make_ushort2(
+            conv_8to16(SUB_F(uv, 1), mask_16bit),
+            conv_8to16(SUB_F(uv, 2), mask_16bit)
+        );
     }
 };

-// nv12->X
-
-struct Convert_nv12_yuv420p
+
+
+// planar10->X
+
+struct Convert_planar10_planar8
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef uchar out_T;
     typedef uchar out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_10to8pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = res.x;
-        DEFAULT_DST(2) = res.y;
+        DEFAULT_DST(1) = conv_10to8pl(SUB_F(uv, 1));
+        DEFAULT_DST(2) = conv_10to8pl(SUB_F(uv, 2));
     }
 };

-struct Convert_nv12_nv12
+struct Convert_planar10_planar10
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
@@ -248,148 +269,145 @@ struct Convert_nv12_nv12
     DEF_F(Convert_uv, out_T_uv)
     {
         DEFAULT_DST(1) = SUB_F(uv, 1);
+        DEFAULT_DST(2) = SUB_F(uv, 2);
     }
 };

-struct Convert_nv12_yuv444p
+struct Convert_planar10_planar16
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_10to16pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = res.x;
-        DEFAULT_DST(2) = res.y;
+        DEFAULT_DST(1) = conv_10to16pl(SUB_F(uv, 1));
+        DEFAULT_DST(2) = conv_10to16pl(SUB_F(uv, 2));
     }
 };

-struct Convert_nv12_p010le
+struct Convert_planar10_semiplanar8
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_10bit);
+        DEFAULT_DST(0) = conv_10to8pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(res.x, mask_10bit),
-            conv_8to16(res.y, mask_10bit)
+        DEFAULT_DST(1) = make_uchar2(
+            conv_10to8pl(SUB_F(uv, 1)),
+            conv_10to8pl(SUB_F(uv, 2))
         );
     }
 };

-struct Convert_nv12_p016le
+struct Convert_planar10_semiplanar10
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef ushort out_T;
     typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
+        DEFAULT_DST(0) = (SUB_F(y, 0) << 6);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        in_T_uv res = SUB_F(uv, 1);
         DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(res.x, mask_16bit),
-            conv_8to16(res.y, mask_16bit)
+            (SUB_F(uv, 1) << 6),
+            (SUB_F(uv, 2) << 6)
         );
     }
 };

-struct Convert_nv12_yuv444p16le
+struct Convert_planar10_semiplanar16
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar2 in_T_uv;
+    static const int in_bit_depth = 10;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
+        DEFAULT_DST(0) = conv_10to16pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_8to16(res.x, mask_16bit);
-        DEFAULT_DST(2) = conv_8to16(res.y, mask_16bit);
+        DEFAULT_DST(1) = make_ushort2(
+            conv_10to16pl(SUB_F(uv, 1)),
+            conv_10to16pl(SUB_F(uv, 2))
+        );
     }
 };

-// yuv444p->X
+// planar16->X

-struct Convert_yuv444p_yuv420p
+struct Convert_planar16_planar8
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef uchar out_T;
     typedef uchar out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = SUB_F(uv, 1);
-        DEFAULT_DST(2) = SUB_F(uv, 2);
+        DEFAULT_DST(1) = conv_16to8(SUB_F(uv, 1));
+        DEFAULT_DST(2) = conv_16to8(SUB_F(uv, 2));
     }
 };

-struct Convert_yuv444p_nv12
+struct Convert_planar16_planar10
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_16to10pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_uchar2(
-            SUB_F(uv, 1),
-            SUB_F(uv, 2)
-        );
+        DEFAULT_DST(1) = conv_16to10pl(SUB_F(uv, 1));
+        DEFAULT_DST(2) = conv_16to10pl(SUB_F(uv, 2));
     }
 };

-struct Convert_yuv444p_yuv444p
+struct Convert_planar16_planar16
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
@@ -403,144 +421,144 @@ struct Convert_yuv444p_yuv444p
     }
 };

-struct Convert_yuv444p_p010le
+struct Convert_planar16_semiplanar8
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_10bit);
+        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(SUB_F(uv, 1), mask_10bit),
-            conv_8to16(SUB_F(uv, 2), mask_10bit)
+        DEFAULT_DST(1) = make_uchar2(
+            conv_16to8(SUB_F(uv, 1)),
+            conv_16to8(SUB_F(uv, 2))
         );
     }
 };

-struct Convert_yuv444p_p016le
+struct Convert_planar16_semiplanar10
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef ushort out_T;
     typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
+        DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         DEFAULT_DST(1) = make_ushort2(
-            conv_8to16(SUB_F(uv, 1), mask_16bit),
-            conv_8to16(SUB_F(uv, 2), mask_16bit)
+            conv_16to10(SUB_F(uv, 1)),
+            conv_16to10(SUB_F(uv, 2))
         );
     }
 };

-struct Convert_yuv444p_yuv444p16le
+struct Convert_planar16_semiplanar16
 {
-    static const int in_bit_depth = 8;
-    typedef uchar in_T;
-    typedef uchar in_T_uv;
+    static const int in_bit_depth = 16;
+    typedef ushort in_T;
+    typedef ushort in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
+        DEFAULT_DST(0) = SUB_F(y, 0);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = conv_8to16(SUB_F(uv, 1), mask_16bit);
-        DEFAULT_DST(2) = conv_8to16(SUB_F(uv, 2), mask_16bit);
+        DEFAULT_DST(1) = make_ushort2(
+            SUB_F(uv, 1),
+            SUB_F(uv, 2)
+        );
     }
 };

-// p010le->X
+// semiplanar8->X

-struct Convert_p010le_yuv420p
+struct Convert_semiplanar8_planar8
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
     typedef uchar out_T;
     typedef uchar out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_10to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = SUB_F(y, 0);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_10to8(res.x);
-        DEFAULT_DST(2) = conv_10to8(res.y);
+        DEFAULT_DST(1) = res.x;
+        DEFAULT_DST(2) = res.y;
     }
 };

-struct Convert_p010le_nv12
+struct Convert_semiplanar8_planar10
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_10to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_8to10pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = make_uchar2(
-            conv_10to8(res.x),
-            conv_10to8(res.y)
-        );
+        DEFAULT_DST(1) = conv_8to10pl(res.x);
+        DEFAULT_DST(2) = conv_8to10pl(res.y);
     }
 };

-struct Convert_p010le_yuv444p
+struct Convert_semiplanar8_planar16
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_10to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_10to8(res.x);
-        DEFAULT_DST(2) = conv_10to8(res.y);
+        DEFAULT_DST(1) = conv_8to16(res.x, mask_16bit);
+        DEFAULT_DST(2) = conv_8to16(res.y, mask_16bit);
     }
 };

-struct Convert_p010le_p010le
+struct Convert_semiplanar8_semiplanar8
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
@@ -553,55 +571,57 @@ struct Convert_p010le_p010le
     }
 };

-struct Convert_p010le_p016le
+struct Convert_semiplanar8_semiplanar10
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
     typedef ushort out_T;
     typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_10to16(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_10bit);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
         DEFAULT_DST(1) = make_ushort2(
-            conv_10to16(res.x),
-            conv_10to16(res.y)
+            conv_8to16(res.x, mask_10bit),
+            conv_8to16(res.y, mask_10bit)
         );
     }
 };

-struct Convert_p010le_yuv444p16le
+struct Convert_semiplanar8_semiplanar16
 {
-    static const int in_bit_depth = 10;
-    typedef ushort in_T;
-    typedef ushort2 in_T_uv;
+    static const int in_bit_depth = 8;
+    typedef uchar in_T;
+    typedef uchar2 in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_10to16(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_8to16(SUB_F(y, 0), mask_16bit);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_10to16(res.x);
-        DEFAULT_DST(2) = conv_10to16(res.y);
+        DEFAULT_DST(1) = make_ushort2(
+            conv_8to16(res.x, mask_16bit),
+            conv_8to16(res.y, mask_16bit)
+        );
     }
 };

-// p016le->X
+// semiplanar10->X

-struct Convert_p016le_yuv420p
+struct Convert_semiplanar10_planar8
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
     typedef uchar out_T;
@@ -609,87 +629,85 @@ struct Convert_p016le_yuv420p

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_10to8(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_16to8(res.x);
-        DEFAULT_DST(2) = conv_16to8(res.y);
+        DEFAULT_DST(1) = conv_10to8(res.x);
+        DEFAULT_DST(2) = conv_10to8(res.y);
     }
 };

-struct Convert_p016le_nv12
+struct Convert_semiplanar10_planar10
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = SUB_F(y, 0) >> 6;
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = make_uchar2(
-            conv_16to8(res.x),
-            conv_16to8(res.y)
-        );
+        DEFAULT_DST(1) = res.x >> 6;
+        DEFAULT_DST(2) = res.y >> 6;
     }
 };

-struct Convert_p016le_yuv444p
+struct Convert_semiplanar10_planar16
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_10to16(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = conv_16to8(res.x);
-        DEFAULT_DST(2) = conv_16to8(res.y);
+        DEFAULT_DST(1) = conv_10to16(res.x);
+        DEFAULT_DST(2) = conv_10to16(res.y);
     }
 };

-struct Convert_p016le_p010le
+struct Convert_semiplanar10_semiplanar8
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_10to8(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = make_ushort2(
-            conv_16to10(res.x),
-            conv_16to10(res.y)
+        DEFAULT_DST(1) = make_uchar2(
+            conv_10to8(res.x),
+            conv_10to8(res.y)
         );
     }
 };

-struct Convert_p016le_p016le
+struct Convert_semiplanar10_semiplanar10
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
     typedef ushort out_T;
@@ -706,34 +724,37 @@ struct Convert_p016le_p016le
     }
 };

-struct Convert_p016le_yuv444p16le
+struct Convert_semiplanar10_semiplanar16
 {
-    static const int in_bit_depth = 16;
+    static const int in_bit_depth = 10;
     typedef ushort in_T;
     typedef ushort2 in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_10to16(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
         in_T_uv res = SUB_F(uv, 1);
-        DEFAULT_DST(1) = res.x;
-        DEFAULT_DST(2) = res.y;
+        DEFAULT_DST(1) = make_ushort2(
+            conv_10to16(res.x),
+            conv_10to16(res.y)
+        );
     }
 };

-// yuv444p16le->X
-
-struct Convert_yuv444p16le_yuv420p
+
+// semiplanar16->X
+
+struct Convert_semiplanar16_planar8
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
+    typedef ushort2 in_T_uv;
     typedef uchar out_T;
     typedef uchar out_T_uv;

@@ -744,104 +765,107 @@ struct Convert_yuv444p16le_yuv420p

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = conv_16to8(SUB_F(uv, 1));
-        DEFAULT_DST(2) = conv_16to8(SUB_F(uv, 2));
+        in_T_uv res = SUB_F(uv, 1);
+        DEFAULT_DST(1) = conv_16to8(res.x);
+        DEFAULT_DST(2) = conv_16to8(res.y);
     }
 };

-struct Convert_yuv444p16le_nv12
+struct Convert_semiplanar16_planar10
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
-    typedef uchar out_T;
-    typedef uchar2 out_T_uv;
+    typedef ushort2 in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_16to10pl(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_uchar2(
-            conv_16to8(SUB_F(uv, 1)),
-            conv_16to8(SUB_F(uv, 2))
-        );
+        in_T_uv res = SUB_F(uv, 1);
+        DEFAULT_DST(1) = conv_16to10pl(res.x);
+        DEFAULT_DST(2) = conv_16to10pl(res.y);
     }
 };

-struct Convert_yuv444p16le_yuv444p
+struct Convert_semiplanar16_planar16
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
-    typedef uchar out_T;
-    typedef uchar out_T_uv;
+    typedef ushort2 in_T_uv;
+    typedef ushort out_T;
+    typedef ushort out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
+        DEFAULT_DST(0) = SUB_F(y, 0);
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = conv_16to8(SUB_F(uv, 1));
-        DEFAULT_DST(2) = conv_16to8(SUB_F(uv, 2));
+        in_T_uv res = SUB_F(uv, 1);
+        DEFAULT_DST(1) = res.x;
+        DEFAULT_DST(2) = res.y;
     }
 };

-struct Convert_yuv444p16le_p010le
+struct Convert_semiplanar16_semiplanar8
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
-    typedef ushort out_T;
-    typedef ushort2 out_T_uv;
+    typedef ushort2 in_T_uv;
+    typedef uchar out_T;
+    typedef uchar2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0));
+        DEFAULT_DST(0) = conv_16to8(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
-        DEFAULT_DST(1) = make_ushort2(
-            conv_16to10(SUB_F(uv, 1)),
-            conv_16to10(SUB_F(uv, 2))
+        in_T_uv res = SUB_F(uv, 1);
+        DEFAULT_DST(1) = make_uchar2(
+            conv_16to8(res.x),
+            conv_16to8(res.y)
         );
     }
 };

-struct Convert_yuv444p16le_p016le
+struct Convert_semiplanar16_semiplanar10
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
+    typedef ushort2 in_T_uv;
     typedef ushort out_T;
     typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
-        DEFAULT_DST(0) = SUB_F(y, 0);
+        DEFAULT_DST(0) = conv_16to10(SUB_F(y, 0));
     }

     DEF_F(Convert_uv, out_T_uv)
     {
+        in_T_uv res = SUB_F(uv, 1);
         DEFAULT_DST(1) = make_ushort2(
-            SUB_F(uv, 1),
-            SUB_F(uv, 2)
+            conv_16to10(res.x),
+            conv_16to10(res.y)
         );
     }
 };

-struct Convert_yuv444p16le_yuv444p16le
+struct Convert_semiplanar16_semiplanar16
 {
     static const int in_bit_depth = 16;
     typedef ushort in_T;
-    typedef ushort in_T_uv;
+    typedef ushort2 in_T_uv;
     typedef ushort out_T;
-    typedef ushort out_T_uv;
+    typedef ushort2 out_T_uv;

     DEF_F(Convert, out_T)
     {
@@ -851,7 +875,6 @@ struct Convert_yuv444p16le_yuv444p16le
     DEF_F(Convert_uv, out_T_uv)
     {
         DEFAULT_DST(1) = SUB_F(uv, 1);
-        DEFAULT_DST(2) = SUB_F(uv, 2);
     }
 };

@@ -1184,12 +1207,12 @@ extern "C" {
     NEAREST_KERNEL(C,_uv)

 #define NEAREST_KERNELS(C) \
-    NEAREST_KERNEL_RAW(yuv420p_ ## C) \
-    NEAREST_KERNEL_RAW(nv12_ ## C) \
-    NEAREST_KERNEL_RAW(yuv444p_ ## C) \
-    NEAREST_KERNEL_RAW(p010le_ ## C) \
-    NEAREST_KERNEL_RAW(p016le_ ## C) \
-    NEAREST_KERNEL_RAW(yuv444p16le_ ## C)
+    NEAREST_KERNEL_RAW(planar8_ ## C) \
+    NEAREST_KERNEL_RAW(planar10_ ## C) \
+    NEAREST_KERNEL_RAW(planar16_ ## C) \
+    NEAREST_KERNEL_RAW(semiplanar8_ ## C) \
+    NEAREST_KERNEL_RAW(semiplanar10_ ## C) \
+    NEAREST_KERNEL_RAW(semiplanar16_ ## C)

 #define NEAREST_KERNELS_RGB(C) \
     NEAREST_KERNEL_RAW(rgb0_ ## C) \
@@ -1197,12 +1220,12 @@ extern "C" {
     NEAREST_KERNEL_RAW(rgba_ ## C) \
     NEAREST_KERNEL_RAW(bgra_ ## C) \

-NEAREST_KERNELS(yuv420p)
-NEAREST_KERNELS(nv12)
-NEAREST_KERNELS(yuv444p)
-NEAREST_KERNELS(p010le)
-NEAREST_KERNELS(p016le)
-NEAREST_KERNELS(yuv444p16le)
+NEAREST_KERNELS(planar8)
+NEAREST_KERNELS(planar10)
+NEAREST_KERNELS(planar16)
+NEAREST_KERNELS(semiplanar8)
+NEAREST_KERNELS(semiplanar10)
+NEAREST_KERNELS(semiplanar16)

 NEAREST_KERNELS_RGB(rgb0)
 NEAREST_KERNELS_RGB(bgr0)
@@ -1224,12 +1247,12 @@ NEAREST_KERNELS_RGB(bgra)
     BILINEAR_KERNEL(C,_uv)

 #define BILINEAR_KERNELS(C) \
-    BILINEAR_KERNEL_RAW(yuv420p_ ## C) \
-    BILINEAR_KERNEL_RAW(nv12_ ## C) \
-    BILINEAR_KERNEL_RAW(yuv444p_ ## C) \
-    BILINEAR_KERNEL_RAW(p010le_ ## C) \
-    BILINEAR_KERNEL_RAW(p016le_ ## C) \
-    BILINEAR_KERNEL_RAW(yuv444p16le_ ## C)
+    BILINEAR_KERNEL_RAW(planar8_ ## C) \
+    BILINEAR_KERNEL_RAW(planar10_ ## C) \
+    BILINEAR_KERNEL_RAW(planar16_ ## C) \
+    BILINEAR_KERNEL_RAW(semiplanar8_ ## C) \
+    BILINEAR_KERNEL_RAW(semiplanar10_ ## C) \
+    BILINEAR_KERNEL_RAW(semiplanar16_ ## C)

 #define BILINEAR_KERNELS_RGB(C) \
     BILINEAR_KERNEL_RAW(rgb0_ ## C) \
@@ -1237,12 +1260,12 @@ NEAREST_KERNELS_RGB(bgra)
     BILINEAR_KERNEL_RAW(rgba_ ## C) \
     BILINEAR_KERNEL_RAW(bgra_ ## C)

-BILINEAR_KERNELS(yuv420p)
-BILINEAR_KERNELS(nv12)
-BILINEAR_KERNELS(yuv444p) -BILINEAR_KERNELS(p010le) -BILINEAR_KERNELS(p016le) -BILINEAR_KERNELS(yuv444p16le) +BILINEAR_KERNELS(planar8) +BILINEAR_KERNELS(planar10) +BILINEAR_KERNELS(planar16) +BILINEAR_KERNELS(semiplanar8) +BILINEAR_KERNELS(semiplanar10) +BILINEAR_KERNELS(semiplanar16) BILINEAR_KERNELS_RGB(rgb0) BILINEAR_KERNELS_RGB(bgr0) @@ -1264,12 +1287,12 @@ BILINEAR_KERNELS_RGB(bgra) BICUBIC_KERNEL(C,_uv) #define BICUBIC_KERNELS(C) \ - BICUBIC_KERNEL_RAW(yuv420p_ ## C) \ - BICUBIC_KERNEL_RAW(nv12_ ## C) \ - BICUBIC_KERNEL_RAW(yuv444p_ ## C) \ - BICUBIC_KERNEL_RAW(p010le_ ## C) \ - BICUBIC_KERNEL_RAW(p016le_ ## C) \ - BICUBIC_KERNEL_RAW(yuv444p16le_ ## C) + BICUBIC_KERNEL_RAW(planar8_ ## C) \ + BICUBIC_KERNEL_RAW(planar10_ ## C) \ + BICUBIC_KERNEL_RAW(planar16_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar8_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar10_ ## C) \ + BICUBIC_KERNEL_RAW(semiplanar16_ ## C) #define BICUBIC_KERNELS_RGB(C) \ BICUBIC_KERNEL_RAW(rgb0_ ## C) \ @@ -1277,12 +1300,12 @@ BILINEAR_KERNELS_RGB(bgra) BICUBIC_KERNEL_RAW(rgba_ ## C) \ BICUBIC_KERNEL_RAW(bgra_ ## C) -BICUBIC_KERNELS(yuv420p) -BICUBIC_KERNELS(nv12) -BICUBIC_KERNELS(yuv444p) -BICUBIC_KERNELS(p010le) -BICUBIC_KERNELS(p016le) -BICUBIC_KERNELS(yuv444p16le) +BICUBIC_KERNELS(planar8) +BICUBIC_KERNELS(planar10) +BICUBIC_KERNELS(planar16) +BICUBIC_KERNELS(semiplanar8) +BICUBIC_KERNELS(semiplanar10) +BICUBIC_KERNELS(semiplanar16) BICUBIC_KERNELS_RGB(rgb0) BICUBIC_KERNELS_RGB(bgr0) @@ -1304,12 +1327,12 @@ BICUBIC_KERNELS_RGB(bgra) LANCZOS_KERNEL(C,_uv) #define LANCZOS_KERNELS(C) \ - LANCZOS_KERNEL_RAW(yuv420p_ ## C) \ - LANCZOS_KERNEL_RAW(nv12_ ## C) \ - LANCZOS_KERNEL_RAW(yuv444p_ ## C) \ - LANCZOS_KERNEL_RAW(p010le_ ## C) \ - LANCZOS_KERNEL_RAW(p016le_ ## C) \ - LANCZOS_KERNEL_RAW(yuv444p16le_ ## C) + LANCZOS_KERNEL_RAW(planar8_ ## C) \ + LANCZOS_KERNEL_RAW(planar10_ ## C) \ + LANCZOS_KERNEL_RAW(planar16_ ## C) \ + LANCZOS_KERNEL_RAW(semiplanar8_ ## C) \ + LANCZOS_KERNEL_RAW(semiplanar10_ ## C) \ + 
LANCZOS_KERNEL_RAW(semiplanar16_ ## C) #define LANCZOS_KERNELS_RGB(C) \ LANCZOS_KERNEL_RAW(rgb0_ ## C) \ @@ -1317,12 +1340,12 @@ BICUBIC_KERNELS_RGB(bgra) LANCZOS_KERNEL_RAW(rgba_ ## C) \ LANCZOS_KERNEL_RAW(bgra_ ## C) -LANCZOS_KERNELS(yuv420p) -LANCZOS_KERNELS(nv12) -LANCZOS_KERNELS(yuv444p) -LANCZOS_KERNELS(p010le) -LANCZOS_KERNELS(p016le) -LANCZOS_KERNELS(yuv444p16le) +LANCZOS_KERNELS(planar8) +LANCZOS_KERNELS(planar10) +LANCZOS_KERNELS(planar16) +LANCZOS_KERNELS(semiplanar8) +LANCZOS_KERNELS(semiplanar10) +LANCZOS_KERNELS(semiplanar16) LANCZOS_KERNELS_RGB(rgb0) LANCZOS_KERNELS_RGB(bgr0) -- 2.39.5 (Apple Git-154) ---------------------------------------------------------------------------= -------- NVIDIA GmbH Wuerselen Amtsgericht Aachen HRB 8361 Managing Directors: Rebecca Peters, Donald Robertson, Janet Hall, Ludwig vo= n Reiche ---------------------------------------------------------------------------= -------- This email message is for the sole use of the intended recipient(s) and may= contain confidential information. Any unauthorized review, use, disclosure or dist= ribution is prohibited. If you are not the intended recipient, please contact the s= ender by reply email and destroy all copies of the original message. ---------------------------------------------------------------------------= -------- --===============6439331732457473877== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". --===============6439331732457473877==--