From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthias Welwarsky via ffmpeg-devel
To: ffmpeg-devel@ffmpeg.org
Cc: Matthias Welwarsky
Reply-To: FFmpeg development discussions and patches
Date: Fri, 05 Sep 2025 16:21:44 +0200
Message-ID: <2756493.vuYhMxLoTh@linux.fritz.box>
Subject: [FFmpeg-devel] [PATCH] libavfilter/vf_remap_opencl: make kernel handle subsampled chroma
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

This patch fixes the handling of YUV frames with subsampled chroma.

Before this patch, the remap kernel was already aware of multiple
planes, but it treated the subsampled chroma planes incorrectly. The
maps were read at the destination plane's coordinates instead of being
scaled to the full-resolution maps, and the source coordinates read from
the maps were not scaled back to the subsampled source plane.
Also, for both Y and UV, the wrong vector components were used.

With this patch, it is possible to run a decode -> remap -> encode chain
entirely on the GPU. I'm using it to reframe videos from an Insta360
camera. On a Lenovo T480 with the built-in UHD 620 GPU, it achieves
about 5x real-time (VAAPI decode -> remap_opencl -> VAAPI encode).

Signed-off-by: Matthias Welwarsky
---
 libavfilter/opencl/remap.cl   | 128 ++++++++++++++++++++++++----------
 libavfilter/vf_remap_opencl.c |  43 ++++++++++--
 2 files changed, 130 insertions(+), 41 deletions(-)

diff --git a/libavfilter/opencl/remap.cl b/libavfilter/opencl/remap.cl
index fba82d134e..1a18f1afff 100644
--- a/libavfilter/opencl/remap.cl
+++ b/libavfilter/opencl/remap.cl
@@ -22,52 +22,108 @@ const sampler_t linear_sampler = (CLK_NORMALIZED_COORDS_FALSE |
 const sampler_t nearest_sampler = (CLK_NORMALIZED_COORDS_FALSE |
                                    CLK_FILTER_NEAREST);
 
-__kernel void remap_near(__write_only image2d_t dst,
-                         __read_only  image2d_t src,
-                         __read_only  image2d_t xmapi,
-                         __read_only  image2d_t ymapi,
-                         float4 fill_color)
+__kernel void remap_near(
+    __write_only image2d_t dst,
+    __read_only  image2d_t src,
+    __read_only  image2d_t xmapi,
+    __read_only  image2d_t ymapi,
+    float4 fill_color,
+    float4 scale,
+    int4 swizzle)
 {
     int2 p = (int2)(get_global_id(0), get_global_id(1));
-    int2 dimi = get_image_dim(src);
-    float2 dimf = (float2)(dimi.x, dimi.y);
-    float4 val;
-    int2 mi;
-    float m;
 
-    float4 xmap = read_imagef(xmapi, nearest_sampler, p);
-    float4 ymap = read_imagef(ymapi, nearest_sampler, p);
-    float2 pos  = (float2)(xmap.x, ymap.x);
-    pos.xy = pos.xy * 65535.f;
+    /* image dimensions */
+    int2 src_dim = get_image_dim(src);
+    int2 dst_dim = get_image_dim(dst);
+    int2 map_dim = get_image_dim(xmapi);
 
-    mi = ((pos >= (float2)(0.f, 0.f)) * (pos < dimf) * (p <= dimi));
-    m = mi.x && mi.y;
-    val = mix(fill_color, read_imagef(src, nearest_sampler, pos), m);
+    float2 src_dimf = (float2)(src_dim.x, src_dim.y);
+    float2 dst_dimf = (float2)(dst_dim.x, dst_dim.y);
+    float2 map_dimf = (float2)(map_dim.x, map_dim.y);
+
+    /* compute map scaling to full-res */
+    float2 map_scale = map_dimf / dst_dimf;
+
+    /* scaled position to fetch from the maps */
+    float2 map_p = (float2)(p.x, p.y) * map_scale;
+
+    /* read mapping coordinates from full-res maps */
+    float4 xmap = read_imagef(xmapi, nearest_sampler, map_p);
+    float4 ymap = read_imagef(ymapi, nearest_sampler, map_p);
+    float2 pos  = (float2)(xmap.x, ymap.x) * 65535.f;
+
+    pos /= map_scale;
+
+    /* check bounds */
+    int2 mi = ((pos >= (float2)(0.f,0.f)) * (pos < src_dimf));
+    float m = mi.x && mi.y;
+
+    /* read source and apply swizzle + scale */
+    float4 src_val = read_imagef(src, nearest_sampler, pos);
+
+    float tmp[4];
+    vstore4(src_val, 0, tmp);
+    src_val = (float4)(tmp[swizzle.x] * scale.x,
+                       tmp[swizzle.y] * scale.y,
+                       tmp[swizzle.z] * scale.z,
+                       tmp[swizzle.w] * scale.w);
+
+    /* mix with fill color if out-of-bounds */
+    float4 val = mix(fill_color, src_val, m);
 
     write_imagef(dst, p, val);
 }
 
-__kernel void remap_linear(__write_only image2d_t dst,
-                           __read_only  image2d_t src,
-                           __read_only  image2d_t xmapi,
-                           __read_only  image2d_t ymapi,
-                           float4 fill_color)
+__kernel void remap_linear(
+    __write_only image2d_t dst,
+    __read_only  image2d_t src,
+    __read_only  image2d_t xmapi,
+    __read_only  image2d_t ymapi,
+    float4 fill_color,
+    float4 scale,
+    int4 swizzle)
 {
     int2 p = (int2)(get_global_id(0), get_global_id(1));
-    int2 dimi = get_image_dim(src);
-    float2 dimf = (float2)(dimi.x, dimi.y);
-    float4 val;
-    int2 mi;
-    float m;
 
-    float4 xmap = read_imagef(xmapi, nearest_sampler, p);
-    float4 ymap = read_imagef(ymapi, nearest_sampler, p);
-    float2 pos  = (float2)(xmap.x, ymap.x);
-    pos.xy = pos.xy * 65535.f;
-
-    mi = ((pos >= (float2)(0.f, 0.f)) * (pos < dimf) * (p <= dimi));
-    m = mi.x && mi.y;
-    val = mix(fill_color, read_imagef(src, linear_sampler, pos), m);
+
+    int2 src_dim = get_image_dim(src);
+    int2 dst_dim = get_image_dim(dst);
+    int2 map_dim = get_image_dim(xmapi);
+
+    float2 src_dimf = (float2)(src_dim.x, src_dim.y);
+    float2 dst_dimf = (float2)(dst_dim.x, dst_dim.y);
+    float2 map_dimf = (float2)(map_dim.x, map_dim.y);
+
+    float2 map_scale = map_dimf / dst_dimf;
+
+    float2 map_p = (float2)(p.x, p.y) * map_scale;
+
+    float4 xmap = read_imagef(xmapi, nearest_sampler, map_p);
+    float4 ymap = read_imagef(ymapi, nearest_sampler, map_p);
+    float2 pos  = (float2)(xmap.x, ymap.x) * 65535.f;
+
+    pos /= map_scale;
+
+    int2 mi = ((pos >= (float2)(0.f,0.f)) * (pos < src_dimf));
+    float m = mi.x && mi.y;
+    float4 src_val = read_imagef(src, linear_sampler, pos);
+
+    float tmp[4];
+    vstore4(src_val, 0, tmp);
+    src_val = (float4)(tmp[swizzle.x] * scale.x,
+                       tmp[swizzle.y] * scale.y,
+                       tmp[swizzle.z] * scale.z,
+                       tmp[swizzle.w] * scale.w);
+
+    float4 val = mix(fill_color, src_val, m);
 
     write_imagef(dst, p, val);
 }
diff --git a/libavfilter/vf_remap_opencl.c b/libavfilter/vf_remap_opencl.c
index bb83944b2f..f6c5211f73 100644
--- a/libavfilter/vf_remap_opencl.c
+++ b/libavfilter/vf_remap_opencl.c
@@ -29,10 +29,18 @@
 #include "opencl_source.h"
 #include "video.h"
 
+#define MAX_PLANES 4
+
+typedef struct PlaneParams {
+    cl_float4 scale;
+    cl_int4 swizzle;
+} PlaneParams;
+
 typedef struct RemapOpenCLContext {
     OpenCLFilterContext ocf;
 
     int nb_planes;
+    PlaneParams plane[MAX_PLANES];
     int interp;
     uint8_t fill_rgba[4];
     cl_float4 cl_fill_color;
@@ -41,6 +49,7 @@ typedef struct RemapOpenCLContext {
     cl_kernel kernel;
     cl_command_queue command_queue;
     FFFrameSync fs;
 } RemapOpenCLContext;
@@ -73,10 +82,9 @@ static int remap_opencl_load(AVFilterContext *avctx,
     cl_int cle;
     const char *source = ff_source_remap_cl;
     const char *kernel = kernels[ctx->interp];
-    const AVPixFmtDescriptor *main_desc;
     int err, main_planes;
-    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(main_format);
-    int is_rgb = !!(desc->flags & AV_PIX_FMT_FLAG_RGB);
+    const AVPixFmtDescriptor *main_desc = av_pix_fmt_desc_get(main_format);
+    int is_rgb = !!(main_desc->flags & AV_PIX_FMT_FLAG_RGB);
     const float scale = 1.f / 255.f;
     uint8_t rgba_map[4];
 
@@ -94,8 +102,6 @@ static int remap_opencl_load(AVFilterContext *avctx,
         ctx->cl_fill_color.s[3] = ctx->fill_rgba[3] * scale;
     }
 
-    main_desc = av_pix_fmt_desc_get(main_format);
-
     main_planes = 0;
     for (int i = 0; i < main_desc->nb_components; i++)
         main_planes = FFMAX(main_planes,
@@ -103,6 +109,26 @@ static int remap_opencl_load(AVFilterContext *avctx,
 
     ctx->nb_planes = main_planes;
 
+    for (int p = 0; p < ctx->nb_planes; p++) {
+        PlaneParams *pp = &ctx->plane[p];
+
+        if (is_rgb) {
+            // RGB plane (single plane)
+            pp->scale = (cl_float4){{1.0f, 1.0f, 1.0f, 1.0f}};
+            pp->swizzle = (cl_int4){{0, 1, 2, 3}};
+        } else if (p == 0) {
+            // Y plane (one component)
+            pp->scale = (cl_float4){{1.0f, 0.0f, 0.0f, 1.0f}};
+            pp->swizzle = (cl_int4){{0, 0, 0, 3}};
+        } else {
+            // chroma plane (up to two components)
+            pp->scale = (cl_float4){{1.0f, 1.0f, 0.0f, 1.0f}};
+            pp->swizzle = (cl_int4){{0, 1, 0, 3}};
+        }
+    }
+
     err = ff_opencl_filter_load_program(avctx, &source, 1);
     if (err < 0)
         goto fail;
@@ -132,6 +158,7 @@ static int remap_opencl_process_frame(FFFrameSync *fs)
     AVFilterContext *avctx = fs->parent;
     AVFilterLink *outlink = avctx->outputs[0];
     RemapOpenCLContext *ctx = avctx->priv;
+    AVFrame *input_main, *input_xmap, *input_ymap;
     AVFrame *output;
     cl_mem mem;
@@ -199,6 +226,12 @@ static int remap_opencl_process_frame(FFFrameSync *fs)
         CL_SET_KERNEL_ARG(ctx->kernel, kernel_arg, cl_float4, &cl_fill_color);
         kernel_arg++;
 
+        CL_SET_KERNEL_ARG(ctx->kernel, kernel_arg, cl_float4, &(ctx->plane[plane].scale));
+        kernel_arg++;
+
+        CL_SET_KERNEL_ARG(ctx->kernel, kernel_arg, cl_int4, &(ctx->plane[plane].swizzle));
+        kernel_arg++;
+
         err = ff_opencl_filter_work_size_from_image(avctx, global_work,
                                                     output, plane, 0);
_______________________________________________
ffmpeg-devel mailing list -- ffmpeg-devel@ffmpeg.org
To unsubscribe send an email to ffmpeg-devel-leave@ffmpeg.org
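
Note for reviewers: the per-plane coordinate handling in the kernels (scale the plane position up to the full-resolution maps, then scale the looked-up source position back down to the plane) can be checked on the host with a small C sketch. This is only an illustration and not part of the patch: the `Vec2` type and the helpers `map_scale`, `plane_to_map` and `map_to_plane` are hypothetical names, and the sketch assumes the source and destination planes have the same dimensions.

```c
#include <assert.h>

/* Hypothetical 2-component float vector, standing in for OpenCL's float2. */
typedef struct Vec2 { float x, y; } Vec2;

/* Ratio of map size to plane size per axis (what the kernels call
 * map_scale): 1.0 for a full-resolution plane, 2.0 for 4:2:0 chroma. */
static Vec2 map_scale(int plane_w, int plane_h, int map_w, int map_h)
{
    assert(plane_w > 0 && plane_h > 0);
    Vec2 s = { (float)map_w / (float)plane_w,
               (float)map_h / (float)plane_h };
    return s;
}

/* Position at which to sample the full-resolution maps for pixel
 * (px, py) of a possibly subsampled plane. */
static Vec2 plane_to_map(int px, int py, Vec2 scale)
{
    Vec2 r = { (float)px * scale.x, (float)py * scale.y };
    return r;
}

/* Convert a full-resolution source coordinate read from the maps back
 * into the coordinate space of the possibly subsampled plane. */
static Vec2 map_to_plane(Vec2 full_res, Vec2 scale)
{
    Vec2 r = { full_res.x / scale.x, full_res.y / scale.y };
    return r;
}
```

With 1920x1080 maps and a 960x540 chroma plane, `map_scale` is 2 on both axes, so chroma pixel (100, 50) samples the maps at (200, 100), and a map entry of (400, 300) selects chroma sample (200, 150). For the luma plane the scale is 1 and both conversions are the identity.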