From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 26 Nov 2025 15:26:30 +0000
To: ffmpeg-devel@ffmpeg.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="xpno2suwenk3gnso"
Content-Disposition: inline
Reply-To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 1/3] swscale: Refactor XYZ+RGB state and add xyz12Torgb48 hook
List-Id: FFmpeg development discussions and patches
From: Arpad Panyik via ffmpeg-devel
Cc: nd@arm.com, Arpad Panyik

--xpno2suwenk3gnso
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Prepare for xyz12Torgb48 architecture-specific optimizations in
subsequent patches by:
- Grouping XYZ+RGB gamma LUTs and 3x3 matrices into ColorXform
  (ctx->xyz2rgb/ctx->rgb2xyz), replacing scattered fields.
- Dropping the unused last matrix column, giving the same or smaller
  SwsInternal size.
- Renaming ff_xyz12Torgb48 to xyz12Torgb48_c and routing calls via the
  new per-context function pointer (ctx->xyz12Torgb48) in graph.c and
  swscale.c.
- Adding ff_sws_init_xyz2rgb and invoking it in swscale init paths
  (normal and unscaled).

These modifications do not introduce any functional changes.
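
For readers unfamiliar with the pattern, the hook described above can be
sketched roughly as follows. This is a minimal standalone illustration
and not code from the patch: DemoCtx stands in for SwsInternal, and
demo_kernel_c, demo_kernel_simd, demo_init_xyz2rgb and the have_simd
flag are hypothetical names. Only the two structs mirror what the patch
adds to swscale_internal.h.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as the GammaLuts/ColorXform structs added by this patch. */
typedef struct GammaLuts {
    uint16_t *xyz;
    uint16_t *rgb;
} GammaLuts;

typedef struct ColorXform {
    GammaLuts gamma;
    int16_t   matrix[3][3];
} ColorXform;

/* Hypothetical stand-in for SwsInternal, reduced to the fields used here. */
typedef struct DemoCtx DemoCtx;
struct DemoCtx {
    void (*xyz12Torgb48)(const DemoCtx *c, uint8_t *dst, int dst_stride,
                         const uint8_t *src, int src_stride, int w, int h);
    ColorXform xyz2rgb;
    ColorXform rgb2xyz;
};

/* Placeholder kernels: they only tag dst[0] so the dispatch is observable. */
static void demo_kernel_c(const DemoCtx *c, uint8_t *dst, int dst_stride,
                          const uint8_t *src, int src_stride, int w, int h)
{
    (void)c; (void)dst_stride; (void)src; (void)src_stride; (void)w; (void)h;
    dst[0] = 'C';
}

static void demo_kernel_simd(const DemoCtx *c, uint8_t *dst, int dst_stride,
                             const uint8_t *src, int src_stride, int w, int h)
{
    (void)c; (void)dst_stride; (void)src; (void)src_stride; (void)w; (void)h;
    dst[0] = 'S';
}

/* Install the C fallback first, then let an (assumed) arch check override it. */
static void demo_init_xyz2rgb(DemoCtx *c, int have_simd)
{
    assert(c);
    c->xyz12Torgb48 = demo_kernel_c;
    if (have_simd)
        c->xyz12Torgb48 = demo_kernel_simd;
}
```

Callers then go through c->xyz12Torgb48(c, ...) instead of naming a
kernel directly, so an architecture-specific init could swap the
implementation without touching the call sites.
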

Signed-off-by: Arpad Panyik
---
 libswscale/graph.c            |  3 +-
 libswscale/swscale.c          | 85 +++++++++++++++++++----------------
 libswscale/swscale_internal.h | 25 +++++++----
 libswscale/swscale_unscaled.c |  2 +
 libswscale/utils.c            | 33 +++++++-------
 5 files changed, 83 insertions(+), 65 deletions(-)

diff --git a/libswscale/graph.c b/libswscale/graph.c
index 0a79b17f89..60ead6e8bb 100644
--- a/libswscale/graph.c
+++ b/libswscale/graph.c
@@ -142,7 +142,8 @@ static void run_rgb0(const SwsImg *out, const SwsImg *in, int y, int h,
 static void run_xyz2rgb(const SwsImg *out, const SwsImg *in, int y, int h,
                         const SwsPass *pass)
 {
-    ff_xyz12Torgb48(pass->priv, out->data[0] + y * out->linesize[0], out->linesize[0],
+    const SwsInternal *c = pass->priv;
+    c->xyz12Torgb48(c, out->data[0] + y * out->linesize[0], out->linesize[0],
                     in->data[0] + y * in->linesize[0], in->linesize[0],
                     pass->width, h);
 }
diff --git a/libswscale/swscale.c b/libswscale/swscale.c
index f4c7eccac4..c795427a83 100644
--- a/libswscale/swscale.c
+++ b/libswscale/swscale.c
@@ -660,6 +660,8 @@ static av_cold void sws_init_swscale(SwsInternal *c)
 {
     enum AVPixelFormat srcFormat = c->opts.src_format;
 
+    ff_sws_init_xyz2rgb(c);
+
     ff_sws_init_output_funcs(c, &c->yuv2plane1, &c->yuv2planeX,
                              &c->yuv2nv12cX, &c->yuv2packed1,
                              &c->yuv2packed2, &c->yuv2packedX, &c->yuv2anyX);
@@ -737,8 +739,8 @@ static int check_image_pointers(const uint8_t * const data[4], enum AVPixelForma
     return 1;
 }
 
-void ff_xyz12Torgb48(const SwsInternal *c, uint8_t *dst, int dst_stride,
-                     const uint8_t *src, int src_stride, int w, int h)
+static void xyz12Torgb48_c(const SwsInternal *c, uint8_t *dst, int dst_stride,
+                           const uint8_t *src, int src_stride, int w, int h)
 {
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(c->opts.src_format);
 
@@ -759,20 +761,20 @@ void ff_xyz12Torgb48(const SwsInternal *c, uint8_t *dst, int dst_stride,
                 z = AV_RL16(src16 + xp + 2);
             }
 
-            x = c->xyzgamma[x >> 4];
-            y = c->xyzgamma[y >> 4];
-            z = c->xyzgamma[z >> 4];
+            x = c->xyz2rgb.gamma.xyz[x >> 4];
+            y = c->xyz2rgb.gamma.xyz[y >> 4];
+            z = c->xyz2rgb.gamma.xyz[z >> 4];
 
             // convert from XYZlinear to sRGBlinear
-            r = c->xyz2rgb_matrix[0][0] * x +
-                c->xyz2rgb_matrix[0][1] * y +
-                c->xyz2rgb_matrix[0][2] * z >> 12;
-            g = c->xyz2rgb_matrix[1][0] * x +
-                c->xyz2rgb_matrix[1][1] * y +
-                c->xyz2rgb_matrix[1][2] * z >> 12;
-            b = c->xyz2rgb_matrix[2][0] * x +
-                c->xyz2rgb_matrix[2][1] * y +
-                c->xyz2rgb_matrix[2][2] * z >> 12;
+            r = c->xyz2rgb.matrix[0][0] * x +
+                c->xyz2rgb.matrix[0][1] * y +
+                c->xyz2rgb.matrix[0][2] * z >> 12;
+            g = c->xyz2rgb.matrix[1][0] * x +
+                c->xyz2rgb.matrix[1][1] * y +
+                c->xyz2rgb.matrix[1][2] * z >> 12;
+            b = c->xyz2rgb.matrix[2][0] * x +
+                c->xyz2rgb.matrix[2][1] * y +
+                c->xyz2rgb.matrix[2][2] * z >> 12;
 
             // limit values to 16-bit depth
             r = av_clip_uint16(r);
@@ -781,13 +783,13 @@ void ff_xyz12Torgb48(const SwsInternal *c, uint8_t *dst, int dst_stride,
 
             // convert from sRGBlinear to RGB and scale from 12bit to 16bit
             if (desc->flags & AV_PIX_FMT_FLAG_BE) {
-                AV_WB16(dst16 + xp + 0, c->rgbgamma[r] << 4);
-                AV_WB16(dst16 + xp + 1, c->rgbgamma[g] << 4);
-                AV_WB16(dst16 + xp + 2, c->rgbgamma[b] << 4);
+                AV_WB16(dst16 + xp + 0, c->xyz2rgb.gamma.rgb[r] << 4);
+                AV_WB16(dst16 + xp + 1, c->xyz2rgb.gamma.rgb[g] << 4);
+                AV_WB16(dst16 + xp + 2, c->xyz2rgb.gamma.rgb[b] << 4);
             } else {
-                AV_WL16(dst16 + xp + 0, c->rgbgamma[r] << 4);
-                AV_WL16(dst16 + xp + 1, c->rgbgamma[g] << 4);
-                AV_WL16(dst16 + xp + 2, c->rgbgamma[b] << 4);
+                AV_WL16(dst16 + xp + 0, c->xyz2rgb.gamma.rgb[r] << 4);
+                AV_WL16(dst16 + xp + 1, c->xyz2rgb.gamma.rgb[g] << 4);
+                AV_WL16(dst16 + xp + 2, c->xyz2rgb.gamma.rgb[b] << 4);
             }
         }
 
@@ -818,20 +820,20 @@ void ff_rgb48Toxyz12(const SwsInternal *c, uint8_t *dst, int dst_stride,
                 b = AV_RL16(src16 + xp + 2);
             }
 
-            r = c->rgbgammainv[r>>4];
-            g = c->rgbgammainv[g>>4];
-            b = c->rgbgammainv[b>>4];
+            r = c->rgb2xyz.gamma.rgb[r >> 4];
+            g = c->rgb2xyz.gamma.rgb[g >> 4];
+            b = c->rgb2xyz.gamma.rgb[b >> 4];
 
             // convert from sRGBlinear to XYZlinear
-            x = c->rgb2xyz_matrix[0][0] * r +
-                c->rgb2xyz_matrix[0][1] * g +
-                c->rgb2xyz_matrix[0][2] * b >> 12;
-            y = c->rgb2xyz_matrix[1][0] * r +
-                c->rgb2xyz_matrix[1][1] * g +
-                c->rgb2xyz_matrix[1][2] * b >> 12;
-            z = c->rgb2xyz_matrix[2][0] * r +
-                c->rgb2xyz_matrix[2][1] * g +
-                c->rgb2xyz_matrix[2][2] * b >> 12;
+            x = c->rgb2xyz.matrix[0][0] * r +
+                c->rgb2xyz.matrix[0][1] * g +
+                c->rgb2xyz.matrix[0][2] * b >> 12;
+            y = c->rgb2xyz.matrix[1][0] * r +
+                c->rgb2xyz.matrix[1][1] * g +
+                c->rgb2xyz.matrix[1][2] * b >> 12;
+            z = c->rgb2xyz.matrix[2][0] * r +
+                c->rgb2xyz.matrix[2][1] * g +
+                c->rgb2xyz.matrix[2][2] * b >> 12;
 
             // limit values to 16-bit depth
             x = av_clip_uint16(x);
@@ -840,13 +842,13 @@ void ff_rgb48Toxyz12(const SwsInternal *c, uint8_t *dst, int dst_stride,
 
             // convert from XYZlinear to X'Y'Z' and scale from 12bit to 16bit
             if (desc->flags & AV_PIX_FMT_FLAG_BE) {
-                AV_WB16(dst16 + xp + 0, c->xyzgammainv[x] << 4);
-                AV_WB16(dst16 + xp + 1, c->xyzgammainv[y] << 4);
-                AV_WB16(dst16 + xp + 2, c->xyzgammainv[z] << 4);
+                AV_WB16(dst16 + xp + 0, c->rgb2xyz.gamma.xyz[x] << 4);
+                AV_WB16(dst16 + xp + 1, c->rgb2xyz.gamma.xyz[y] << 4);
+                AV_WB16(dst16 + xp + 2, c->rgb2xyz.gamma.xyz[z] << 4);
             } else {
-                AV_WL16(dst16 + xp + 0, c->xyzgammainv[x] << 4);
-                AV_WL16(dst16 + xp + 1, c->xyzgammainv[y] << 4);
-                AV_WL16(dst16 + xp + 2, c->xyzgammainv[z] << 4);
+                AV_WL16(dst16 + xp + 0, c->rgb2xyz.gamma.xyz[x] << 4);
+                AV_WL16(dst16 + xp + 1, c->rgb2xyz.gamma.xyz[y] << 4);
+                AV_WL16(dst16 + xp + 2, c->rgb2xyz.gamma.xyz[z] << 4);
             }
         }
 
@@ -855,6 +857,11 @@ void ff_rgb48Toxyz12(const SwsInternal *c, uint8_t *dst, int dst_stride,
     }
 }
 
+av_cold void ff_sws_init_xyz2rgb(SwsInternal *c)
+{
+    c->xyz12Torgb48 = xyz12Torgb48_c;
+}
+
 void ff_update_palette(SwsInternal *c, const uint32_t *pal)
 {
     for (int i = 0; i < 256; i++) {
@@ -1110,7 +1117,7 @@ static int scale_internal(SwsContext *sws,
         base = srcStride[0] < 0 ? c->xyz_scratch - srcStride[0] * (srcSliceH-1) :
                                   c->xyz_scratch;
 
-        ff_xyz12Torgb48(c, base, srcStride[0], src2[0], srcStride[0], sws->src_w, srcSliceH);
+        c->xyz12Torgb48(c, base, srcStride[0], src2[0], srcStride[0], sws->src_w, srcSliceH);
         src2[0] = base;
     }
 
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 5dd65a8d71..107671feb2 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -93,6 +93,16 @@ typedef int (*SwsFunc)(SwsInternal *c, const uint8_t *const src[],
                        const int srcStride[], int srcSliceY, int srcSliceH,
                        uint8_t *const dst[], const int dstStride[]);
 
+typedef struct GammaLuts {
+    uint16_t *xyz;
+    uint16_t *rgb;
+} GammaLuts;
+
+typedef struct ColorXform {
+    GammaLuts gamma;
+    int16_t matrix[3][3];
+} ColorXform;
+
 /**
  * Write one line of horizontally scaled data to planar output
  * without any additional vertical scaling (or point-scaling).
@@ -547,12 +557,10 @@ struct SwsInternal {
     /* pre defined color-spaces gamma */
 #define XYZ_GAMMA (2.6)
 #define RGB_GAMMA (2.2)
-    uint16_t *xyzgamma;
-    uint16_t *rgbgamma;
-    uint16_t *xyzgammainv;
-    uint16_t *rgbgammainv;
-    int16_t xyz2rgb_matrix[3][4];
-    int16_t rgb2xyz_matrix[3][4];
+    void (*xyz12Torgb48)(const SwsInternal *c, uint8_t *dst, int dst_stride,
+                         const uint8_t *src, int src_stride, int w, int h);
+    ColorXform xyz2rgb;
+    ColorXform rgb2xyz;
 
     /* function pointers for swscale() */
     yuv2planar1_fn yuv2plane1;
@@ -720,6 +728,8 @@ av_cold void ff_sws_init_range_convert_loongarch(SwsInternal *c);
 av_cold void ff_sws_init_range_convert_riscv(SwsInternal *c);
 av_cold void ff_sws_init_range_convert_x86(SwsInternal *c);
 
+av_cold void ff_sws_init_xyz2rgb(SwsInternal *c);
+
 SwsFunc ff_yuv2rgb_init_x86(SwsInternal *c);
 SwsFunc ff_yuv2rgb_init_ppc(SwsInternal *c);
 SwsFunc ff_yuv2rgb_init_loongarch(SwsInternal *c);
@@ -1043,9 +1053,6 @@ void ff_copyPlane(const uint8_t *src, int srcStride,
                   int srcSliceY, int srcSliceH, int width,
                   uint8_t *dst, int dstStride);
 
-void ff_xyz12Torgb48(const SwsInternal *c, uint8_t *dst, int dst_stride,
-                     const uint8_t *src, int src_stride, int w, int h);
-
 void ff_rgb48Toxyz12(const SwsInternal *c, uint8_t *dst, int dst_stride,
                      const uint8_t *src, int src_stride, int w, int h);
 
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 2c791e89fe..7be0690882 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -2685,6 +2685,8 @@ void ff_get_unscaled_swscale(SwsInternal *c)
         }
     }
 
+    ff_sws_init_xyz2rgb(c);
+
 #if ARCH_PPC
     ff_get_unscaled_swscale_ppc(c);
 #elif ARCH_ARM
diff --git a/libswscale/utils.c b/libswscale/utils.c
index a13d8df7e8..79de0ea9c9 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -721,34 +721,35 @@ static av_cold void init_xyz_tables(void)
 
 static int fill_xyztables(SwsInternal *c)
 {
-    static const int16_t xyz2rgb_matrix[3][4] = {
+    static const int16_t xyz2rgb_matrix[3][3] = {
         {13270, -6295, -2041},
         {-3969,  7682,   170},
         {  228,  -835,  4329} };
-    static const int16_t rgb2xyz_matrix[3][4] = {
+    static const int16_t rgb2xyz_matrix[3][3] = {
         {1689, 1464,  739},
         { 871, 2929,  296},
         {  79,  488, 3891} };
 
-    if (c->xyzgamma)
+    if (c->xyz2rgb.gamma.xyz)
         return 0;
 
-    memcpy(c->xyz2rgb_matrix, xyz2rgb_matrix, sizeof(c->xyz2rgb_matrix));
-    memcpy(c->rgb2xyz_matrix, rgb2xyz_matrix, sizeof(c->rgb2xyz_matrix));
+    memcpy(c->xyz2rgb.matrix, xyz2rgb_matrix, sizeof(c->xyz2rgb.matrix));
+    memcpy(c->rgb2xyz.matrix, rgb2xyz_matrix, sizeof(c->rgb2xyz.matrix));
 
 #if CONFIG_SMALL
-    c->xyzgamma = av_malloc(sizeof(uint16_t) * 2 * (4096 + 65536));
-    if (!c->xyzgamma)
+    c->xyz2rgb.gamma.xyz = av_malloc(sizeof(uint16_t) * 2 * (4096 + 65536));
+    if (!c->xyz2rgb.gamma.xyz)
         return AVERROR(ENOMEM);
-    c->rgbgammainv = c->xyzgamma + 4096;
-    c->rgbgamma = c->rgbgammainv + 4096;
-    c->xyzgammainv = c->rgbgamma + 65536;
-    init_xyz_tables(c->xyzgamma, c->xyzgammainv, c->rgbgamma, c->rgbgammainv);
+    c->rgb2xyz.gamma.rgb = c->xyz2rgb.gamma.xyz + 4096;
+    c->xyz2rgb.gamma.rgb = c->rgb2xyz.gamma.rgb + 4096;
+    c->rgb2xyz.gamma.xyz = c->xyz2rgb.gamma.rgb + 65536;
+    init_xyz_tables(c->xyz2rgb.gamma.xyz, c->rgb2xyz.gamma.xyz,
+                    c->xyz2rgb.gamma.rgb, c->rgb2xyz.gamma.rgb);
 #else
-    c->xyzgamma = xyzgamma_tab;
-    c->rgbgamma = rgbgamma_tab;
-    c->xyzgammainv = xyzgammainv_tab;
-    c->rgbgammainv = rgbgammainv_tab;
+    c->xyz2rgb.gamma.xyz = xyzgamma_tab;
+    c->xyz2rgb.gamma.rgb = rgbgamma_tab;
+    c->rgb2xyz.gamma.xyz = xyzgammainv_tab;
+    c->rgb2xyz.gamma.rgb = rgbgammainv_tab;
 
     static AVOnce xyz_init_static_once = AV_ONCE_INIT;
     ff_thread_once(&xyz_init_static_once, init_xyz_tables);
@@ -2312,7 +2313,7 @@ void sws_freeContext(SwsContext *sws)
     av_freep(&c->gamma);
     av_freep(&c->inv_gamma);
 #if CONFIG_SMALL
-    av_freep(&c->xyzgamma);
+    av_freep(&c->xyz2rgb.gamma.xyz);
 #endif
 
     av_freep(&c->rgb0_scratch);
-- 
2.43.0

--xpno2suwenk3gnso--
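
As a reading aid for the per-pixel arithmetic in xyz12Torgb48_c, the
matrix step can be isolated as below. This is a minimal sketch under
stated assumptions, not the patch's code: demo_xyz2rgb, demo_clip_u16
and demo_matrix_q12 are hypothetical names, the gamma LUT lookups before
and after the matrix are left out, and inputs are assumed to be linear
values in the range 0..65535. The coefficients are the ones from
fill_xyztables(); because the result is shifted right by 12, a
coefficient of 4096 corresponds to 1.0.

```c
#include <assert.h>
#include <stdint.h>

/* XYZ -> RGB coefficients copied from fill_xyztables(); Q12 fixed point. */
static const int16_t demo_xyz2rgb[3][3] = {
    {13270, -6295, -2041},
    {-3969,  7682,   170},
    {  228,  -835,  4329},
};

/* Local stand-in for av_clip_uint16(): clamp to the range 0..65535. */
static int demo_clip_u16(int v)
{
    return v < 0 ? 0 : v > 65535 ? 65535 : v;
}

/*
 * Multiply one linear triple by a Q12 matrix and clip to 16 bits.
 * Like the code in the patch, this right-shifts a possibly negative int,
 * which assumes the compiler implements an arithmetic shift.
 */
static void demo_matrix_q12(const int16_t m[3][3], const int in[3], int out[3])
{
    for (int i = 0; i < 3; i++) {
        assert(in[i] >= 0 && in[i] <= 65535);
    }
    for (int i = 0; i < 3; i++) {
        int acc = m[i][0] * in[0] + m[i][1] * in[1] + m[i][2] * in[2];
        out[i] = demo_clip_u16(acc >> 12);
    }
}
```

With x = y = z = 4096 the shift cancels the scale exactly, so each
output equals the sum of its matrix row: 4934, 3883 and 3722.
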