Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb
@ 2024-07-23 12:46 Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 2/4] swscale/yuv2rgb: add yuv42{0, 2}p -> gbrp unscaled colorspace converters Ramiro Polla
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Ramiro Polla @ 2024-07-23 12:46 UTC
  To: ffmpeg-devel

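Replace the dst_1/dst_2 row pointers with a dst_p[plane][line] array and
add an nb_dst_planes parameter to YUV2RGBFUNC and ENDYUV2RGBLINE, so that
a converter can write to more than one destination plane. All existing
converters keep using a single plane (nb_dst_planes = 1).

This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
colorspace converters.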
---
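For readers following the macro changes, the new row-pointer setup can be
sketched as a plain function. This is only an illustration under stated
assumptions: setup_dst_rows and NB_LINES are hypothetical names that do not
appear in the patch, and dst_type is fixed to uint8_t here. The loop body
mirrors the one added to YUV2RGBFUNC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_LINES 2 /* the converters emit two output rows per iteration */

/* Hypothetical helper (not part of the patch): for every destination
 * plane p, store the start of rows yd and yd + 1 in dst_p[p][0] and
 * dst_p[p][1], as the loop added to YUV2RGBFUNC does. */
static void setup_dst_rows(uint8_t *dst_p[][NB_LINES], int nb_dst_planes,
                           uint8_t *const dst[], const int dstStride[],
                           int yd)
{
    assert(nb_dst_planes >= 1);
    for (int p = 0; p < nb_dst_planes; p++) {
        dst_p[p][0] = dst[p] + (ptrdiff_t)(yd)     * dstStride[p];
        dst_p[p][1] = dst[p] + (ptrdiff_t)(yd + 1) * dstStride[p];
    }
}
```

With nb_dst_planes = 1 this reduces to the old dst_1/dst_2 pair, which is
why the packed converters only need the extra argument.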
 libswscale/yuv2rgb.c | 279 ++++++++++++++++++++++---------------------
 1 file changed, 142 insertions(+), 137 deletions(-)

diff --git a/libswscale/yuv2rgb.c b/libswscale/yuv2rgb.c
index cfbc54abd0..c283d6d1bd 100644
--- a/libswscale/yuv2rgb.c
+++ b/libswscale/yuv2rgb.c
@@ -72,59 +72,59 @@ const int *sws_getCoefficients(int colorspace)
     g = (void *)(c->table_gU[U+YUVRGB_TABLE_HEADROOM] + c->table_gV[V+YUVRGB_TABLE_HEADROOM]);  \
     b = (void *)c->table_bU[U+YUVRGB_TABLE_HEADROOM];
 
-#define PUTRGB(dst, src, asrc, i, abase)            \
+#define PUTRGB(l, src, asrc, i, abase)              \
     Y              = src[2 * i];                    \
-    dst[2 * i]     = r[Y] + g[Y] + b[Y];            \
+    dst_p[0][l][2 * i]     = r[Y] + g[Y] + b[Y];    \
     Y              = src[2 * i + 1];                \
-    dst[2 * i + 1] = r[Y] + g[Y] + b[Y];
+    dst_p[0][l][2 * i + 1] = r[Y] + g[Y] + b[Y];
 
-#define PUTRGB24(dst, src, asrc, i, abase)          \
+#define PUTRGB24(l, src, asrc, i, abase)            \
     Y              = src[2 * i];                    \
-    dst[6 * i + 0] = r[Y];                          \
-    dst[6 * i + 1] = g[Y];                          \
-    dst[6 * i + 2] = b[Y];                          \
+    dst_p[0][l][6 * i + 0] = r[Y];                  \
+    dst_p[0][l][6 * i + 1] = g[Y];                  \
+    dst_p[0][l][6 * i + 2] = b[Y];                  \
     Y              = src[2 * i + 1];                \
-    dst[6 * i + 3] = r[Y];                          \
-    dst[6 * i + 4] = g[Y];                          \
-    dst[6 * i + 5] = b[Y];
+    dst_p[0][l][6 * i + 3] = r[Y];                  \
+    dst_p[0][l][6 * i + 4] = g[Y];                  \
+    dst_p[0][l][6 * i + 5] = b[Y];
 
-#define PUTBGR24(dst, src, asrc, i, abase)          \
+#define PUTBGR24(l, src, asrc, i, abase)            \
     Y              = src[2 * i];                    \
-    dst[6 * i + 0] = b[Y];                          \
-    dst[6 * i + 1] = g[Y];                          \
-    dst[6 * i + 2] = r[Y];                          \
+    dst_p[0][l][6 * i + 0] = b[Y];                  \
+    dst_p[0][l][6 * i + 1] = g[Y];                  \
+    dst_p[0][l][6 * i + 2] = r[Y];                  \
     Y              = src[2 * i + 1];                \
-    dst[6 * i + 3] = b[Y];                          \
-    dst[6 * i + 4] = g[Y];                          \
-    dst[6 * i + 5] = r[Y];
+    dst_p[0][l][6 * i + 3] = b[Y];                  \
+    dst_p[0][l][6 * i + 4] = g[Y];                  \
+    dst_p[0][l][6 * i + 5] = r[Y];
 
-#define PUTRGBA(dst, ysrc, asrc, i, abase)                              \
+#define PUTRGBA(l, ysrc, asrc, i, abase)                                \
     Y              = ysrc[2 * i];                                       \
-    dst[2 * i]     = r[Y] + g[Y] + b[Y] + ((uint32_t)(asrc[2 * i])     << abase);   \
+    dst_p[0][l][2 * i]     = r[Y] + g[Y] + b[Y] + ((uint32_t)(asrc[2 * i])     << abase);   \
     Y              = ysrc[2 * i + 1];                                   \
-    dst[2 * i + 1] = r[Y] + g[Y] + b[Y] + ((uint32_t)(asrc[2 * i + 1]) << abase);
+    dst_p[0][l][2 * i + 1] = r[Y] + g[Y] + b[Y] + ((uint32_t)(asrc[2 * i + 1]) << abase);
 
-#define PUTRGB48(dst, src, asrc, i, abase)          \
+#define PUTRGB48(l, src, asrc, i, abase)            \
     Y                = src[ 2 * i];                 \
-    dst[12 * i +  0] = dst[12 * i +  1] = r[Y];     \
-    dst[12 * i +  2] = dst[12 * i +  3] = g[Y];     \
-    dst[12 * i +  4] = dst[12 * i +  5] = b[Y];     \
+    dst_p[0][l][12 * i +  0] = dst_p[0][l][12 * i +  1] = r[Y]; \
+    dst_p[0][l][12 * i +  2] = dst_p[0][l][12 * i +  3] = g[Y]; \
+    dst_p[0][l][12 * i +  4] = dst_p[0][l][12 * i +  5] = b[Y]; \
     Y                = src[ 2 * i + 1];             \
-    dst[12 * i +  6] = dst[12 * i +  7] = r[Y];     \
-    dst[12 * i +  8] = dst[12 * i +  9] = g[Y];     \
-    dst[12 * i + 10] = dst[12 * i + 11] = b[Y];
+    dst_p[0][l][12 * i +  6] = dst_p[0][l][12 * i +  7] = r[Y]; \
+    dst_p[0][l][12 * i +  8] = dst_p[0][l][12 * i +  9] = g[Y]; \
+    dst_p[0][l][12 * i + 10] = dst_p[0][l][12 * i + 11] = b[Y];
 
-#define PUTBGR48(dst, src, asrc, i, abase)          \
+#define PUTBGR48(l, src, asrc, i, abase)            \
     Y                = src[2 * i];                  \
-    dst[12 * i +  0] = dst[12 * i +  1] = b[Y];     \
-    dst[12 * i +  2] = dst[12 * i +  3] = g[Y];     \
-    dst[12 * i +  4] = dst[12 * i +  5] = r[Y];     \
+    dst_p[0][l][12 * i +  0] = dst_p[0][l][12 * i +  1] = b[Y]; \
+    dst_p[0][l][12 * i +  2] = dst_p[0][l][12 * i +  3] = g[Y]; \
+    dst_p[0][l][12 * i +  4] = dst_p[0][l][12 * i +  5] = r[Y]; \
     Y                = src[2  * i +  1];            \
-    dst[12 * i +  6] = dst[12 * i +  7] = b[Y];     \
-    dst[12 * i +  8] = dst[12 * i +  9] = g[Y];     \
-    dst[12 * i + 10] = dst[12 * i + 11] = r[Y];
+    dst_p[0][l][12 * i +  6] = dst_p[0][l][12 * i +  7] = b[Y]; \
+    dst_p[0][l][12 * i +  8] = dst_p[0][l][12 * i +  9] = g[Y]; \
+    dst_p[0][l][12 * i + 10] = dst_p[0][l][12 * i + 11] = r[Y];
 
-#define YUV2RGBFUNC(func_name, dst_type, alpha, yuv422)                     \
+#define YUV2RGBFUNC(func_name, dst_type, alpha, yuv422, nb_dst_planes)      \
     static int func_name(SwsContext *c, const uint8_t *src[],               \
                          int srcStride[], int srcSliceY, int srcSliceH,     \
                          uint8_t *dst[], int dstStride[])                   \
@@ -133,10 +133,7 @@ const int *sws_getCoefficients(int colorspace)
                                                                             \
         for (y = 0; y < srcSliceH; y += 2) {                                \
             int yd = y + srcSliceY;                                         \
-            dst_type *dst_1 =                                               \
-                (dst_type *)(dst[0] + (yd)     * dstStride[0]);             \
-            dst_type *dst_2 =                                               \
-                (dst_type *)(dst[0] + (yd + 1) * dstStride[0]);             \
+            dst_type *dst_p[nb_dst_planes][2];                              \
             dst_type av_unused *r, *g, *b;                                  \
             const uint8_t *py_1 = src[0] +  y       * srcStride[0];         \
             const uint8_t *py_2 = py_1   +            srcStride[0];         \
@@ -145,6 +142,12 @@ const int *sws_getCoefficients(int colorspace)
             const uint8_t av_unused *pu_2, *pv_2;                           \
             const uint8_t av_unused *pa_1, *pa_2;                           \
             unsigned int h_size = c->dstW >> 3;                             \
+            for (int p = 0; p < nb_dst_planes; p++) {                       \
+                dst_p[p][0] =                                               \
+                    (dst_type *)(dst[p] + (yd)     * dstStride[p]);         \
+                dst_p[p][1] =                                               \
+                    (dst_type *)(dst[p] + (yd + 1) * dstStride[p]);         \
+            }                                                               \
             if (yuv422) {                                                   \
                 pu_2 = pu_1 + srcStride[1];                                 \
                 pv_2 = pv_1 + srcStride[2];                                 \
@@ -156,7 +159,7 @@ const int *sws_getCoefficients(int colorspace)
             while (h_size--) {                                              \
                 int av_unused U, V, Y;                                      \
 
-#define ENDYUV2RGBLINE(dst_delta, ss, alpha, yuv422) \
+#define ENDYUV2RGBLINE(dst_delta, ss, alpha, yuv422, nb_dst_planes) \
     pu_1  += 4 >> ss;                               \
     pv_1  += 4 >> ss;                               \
     if (yuv422) {                                   \
@@ -169,8 +172,10 @@ const int *sws_getCoefficients(int colorspace)
         pa_1 += 8 >> ss;                            \
         pa_2 += 8 >> ss;                            \
     }                                               \
-    dst_1 += dst_delta >> ss;                       \
-    dst_2 += dst_delta >> ss;                       \
+    for (int p = 0; p < nb_dst_planes; p++) {       \
+        dst_p[p][0] += dst_delta >> ss;             \
+        dst_p[p][1] += dst_delta >> ss;             \
+    }                                               \
     }                                               \
     if (c->dstW & (4 >> ss)) {                      \
         int av_unused Y, U, V;                      \
@@ -181,168 +186,168 @@ const int *sws_getCoefficients(int colorspace)
         return srcSliceH;                           \
     }
 
-#define YUV420FUNC(func_name, dst_type, alpha, abase, PUTFUNC, dst_delta) \
-    YUV2RGBFUNC(func_name, dst_type, alpha, 0)                          \
+#define YUV420FUNC(func_name, dst_type, alpha, abase, PUTFUNC, dst_delta, nb_dst_planes) \
+    YUV2RGBFUNC(func_name, dst_type, alpha, 0, nb_dst_planes)           \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 1, abase);                           \
-        PUTFUNC(dst_1, py_1, pa_1, 1, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 1, abase);                               \
+        PUTFUNC(0, py_1, pa_1, 1, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 2);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 2, abase);                           \
-        PUTFUNC(dst_2, py_2, pa_2, 2, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 2, abase);                               \
+        PUTFUNC(1, py_2, pa_2, 2, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 3);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 3, abase);                           \
-        PUTFUNC(dst_1, py_1, pa_1, 3, abase);                           \
-    ENDYUV2RGBLINE(dst_delta, 0, alpha, 0)                              \
+        PUTFUNC(1, py_2, pa_2, 3, abase);                               \
+        PUTFUNC(0, py_1, pa_1, 3, abase);                               \
+    ENDYUV2RGBLINE(dst_delta, 0, alpha, 0, nb_dst_planes)               \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 1, abase);                           \
-        PUTFUNC(dst_1, py_1, pa_1, 1, abase);                           \
-    ENDYUV2RGBLINE(dst_delta, 1, alpha, 0)                              \
+        PUTFUNC(1, py_2, pa_2, 1, abase);                               \
+        PUTFUNC(0, py_1, pa_1, 1, abase);                               \
+    ENDYUV2RGBLINE(dst_delta, 1, alpha, 0, nb_dst_planes)               \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
     ENDYUV2RGBFUNC()
 
-#define YUV422FUNC(func_name, dst_type, alpha, abase, PUTFUNC, dst_delta) \
-    YUV2RGBFUNC(func_name, dst_type, alpha, 1)                          \
+#define YUV422FUNC(func_name, dst_type, alpha, abase, PUTFUNC, dst_delta, nb_dst_planes) \
+    YUV2RGBFUNC(func_name, dst_type, alpha, 1, nb_dst_planes)           \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 1);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 1, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 1, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 1, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 1, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 2);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 2, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 2, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 2);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 2, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 2, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 3);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 3, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 3, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 3);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 3, abase);                           \
-    ENDYUV2RGBLINE(dst_delta, 0, alpha, 1)                              \
+        PUTFUNC(0, py_1, pa_1, 3, abase);                               \
+    ENDYUV2RGBLINE(dst_delta, 0, alpha, 1, nb_dst_planes)               \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 1);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 1, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 1, abase);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 1, abase);                           \
-    ENDYUV2RGBLINE(dst_delta, 1, alpha, 1)                              \
+        PUTFUNC(0, py_1, pa_1, 1, abase);                               \
+    ENDYUV2RGBLINE(dst_delta, 1, alpha, 1, nb_dst_planes)               \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, pa_1, 0, abase);                           \
+        PUTFUNC(0, py_1, pa_1, 0, abase);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, pa_2, 0, abase);                           \
+        PUTFUNC(1, py_2, pa_2, 0, abase);                               \
     ENDYUV2RGBFUNC()
 
 #define YUV420FUNC_DITHER(func_name, dst_type, LOADDITHER, PUTFUNC, dst_delta) \
-    YUV2RGBFUNC(func_name, dst_type, 0, 0)                              \
+    YUV2RGBFUNC(func_name, dst_type, 0, 0, 1)                           \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_2, py_2, 1, 2 + 8);                                 \
-        PUTFUNC(dst_1, py_1, 1, 2);                                     \
+        PUTFUNC(dst_p[0][1], py_2, 1, 2 + 8);                           \
+        PUTFUNC(dst_p[0][0], py_1, 1, 2);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 2);                                      \
-        PUTFUNC(dst_1, py_1, 2, 4);                                     \
-        PUTFUNC(dst_2, py_2, 2, 4 + 8);                                 \
+        PUTFUNC(dst_p[0][0], py_1, 2, 4);                               \
+        PUTFUNC(dst_p[0][1], py_2, 2, 4 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 3);                                      \
-        PUTFUNC(dst_2, py_2, 3, 6 + 8);                                 \
-        PUTFUNC(dst_1, py_1, 3, 6);                                     \
-    ENDYUV2RGBLINE(dst_delta, 0, 0, 0)                                  \
+        PUTFUNC(dst_p[0][1], py_2, 3, 6 + 8);                           \
+        PUTFUNC(dst_p[0][0], py_1, 3, 6);                               \
+    ENDYUV2RGBLINE(dst_delta, 0, 0, 0, 1)                               \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_2, py_2, 1, 2 + 8);                                 \
-        PUTFUNC(dst_1, py_1, 1, 2);                                     \
-    ENDYUV2RGBLINE(dst_delta, 1, 0, 0)                                  \
+        PUTFUNC(dst_p[0][1], py_2, 1, 2 + 8);                           \
+        PUTFUNC(dst_p[0][0], py_1, 1, 2);                               \
+    ENDYUV2RGBLINE(dst_delta, 1, 0, 0, 1)                               \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
     ENDYUV2RGBFUNC()
 
 #define YUV422FUNC_DITHER(func_name, dst_type, LOADDITHER, PUTFUNC, dst_delta) \
-    YUV2RGBFUNC(func_name, dst_type, 0, 1)                              \
+    YUV2RGBFUNC(func_name, dst_type, 0, 1, 1)                           \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 1);                                      \
-        PUTFUNC(dst_2, py_2, 1, 2 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 1, 2 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_1, py_1, 1, 2);                                     \
+        PUTFUNC(dst_p[0][0], py_1, 1, 2);                               \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 2);                                      \
-        PUTFUNC(dst_1, py_1, 2, 4);                                     \
+        PUTFUNC(dst_p[0][0], py_1, 2, 4);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 2);                                      \
-        PUTFUNC(dst_2, py_2, 2, 4 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 2, 4 + 8);                           \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 3);                                      \
-        PUTFUNC(dst_2, py_2, 3, 6 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 3, 6 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 3);                                      \
-        PUTFUNC(dst_1, py_1, 3, 6);                                     \
-    ENDYUV2RGBLINE(dst_delta, 0, 0, 1)                                  \
+        PUTFUNC(dst_p[0][0], py_1, 3, 6);                               \
+    ENDYUV2RGBLINE(dst_delta, 0, 0, 1, 1)                               \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 1);                                      \
-        PUTFUNC(dst_2, py_2, 1, 2 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 1, 2 + 8);                           \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 1);                                      \
-        PUTFUNC(dst_1, py_1, 1, 2);                                     \
-    ENDYUV2RGBLINE(dst_delta, 1, 0, 1)                                  \
+        PUTFUNC(dst_p[0][0], py_1, 1, 2);                               \
+    ENDYUV2RGBLINE(dst_delta, 1, 0, 1, 1)                               \
         LOADDITHER                                                      \
                                                                         \
         LOADCHROMA(pu_1, pv_1, 0);                                      \
-        PUTFUNC(dst_1, py_1, 0, 0);                                     \
+        PUTFUNC(dst_p[0][0], py_1, 0, 0);                               \
                                                                         \
         LOADCHROMA(pu_2, pv_2, 0);                                      \
-        PUTFUNC(dst_2, py_2, 0, 0 + 8);                                 \
+        PUTFUNC(dst_p[0][1], py_2, 0, 0 + 8);                           \
     ENDYUV2RGBFUNC()
 
 #define LOADDITHER16                                    \
@@ -431,7 +436,7 @@ const int *sws_getCoefficients(int colorspace)
                      g[Y +  d64[1 + o]] +           \
                      b[Y + d128[1 + o]];
 
-YUV2RGBFUNC(yuv2rgb_c_1_ordered_dither, uint8_t, 0, 0)
+YUV2RGBFUNC(yuv2rgb_c_1_ordered_dither, uint8_t, 0, 0, 1)
     const uint8_t *d128 = ff_dither_8x8_220[yd & 7];
     char out_1 = 0, out_2 = 0;
     g = c->table_gU[128 + YUVRGB_TABLE_HEADROOM] + c->table_gV[128 + YUVRGB_TABLE_HEADROOM];
@@ -454,13 +459,13 @@ YUV2RGBFUNC(yuv2rgb_c_1_ordered_dither, uint8_t, 0, 0)
     PUTRGB1(out_2, py_2, 3, 6 + 8);
     PUTRGB1(out_1, py_1, 3, 6);
 
-    dst_1[0] = out_1;
-    dst_2[0] = out_2;
+    dst_p[0][0][0] = out_1;
+    dst_p[0][1][0] = out_2;
 
     py_1  += 8;
     py_2  += 8;
-    dst_1 += 1;
-    dst_2 += 1;
+    dst_p[0][0] += 1;
+    dst_p[0][1] += 1;
     }
     if (c->dstW & 7) {
         int av_unused Y, U, V;
@@ -489,23 +494,23 @@ YUV2RGBFUNC(yuv2rgb_c_1_ordered_dither, uint8_t, 0, 0)
     PUTRGB1_OR00(out_2, py_2, 3, 6 + 8);
     PUTRGB1_OR00(out_1, py_1, 3, 6);
 
-    dst_1[0] = out_1;
-    dst_2[0] = out_2;
+    dst_p[0][0][0] = out_1;
+    dst_p[0][1][0] = out_2;
 ENDYUV2RGBFUNC()
 
 // YUV420
-YUV420FUNC(yuv2rgb_c_48,     uint8_t,  0,  0, PUTRGB48, 48)
-YUV420FUNC(yuv2rgb_c_bgr48,  uint8_t,  0,  0, PUTBGR48, 48)
-YUV420FUNC(yuv2rgb_c_32,     uint32_t, 0,  0, PUTRGB,    8)
+YUV420FUNC(yuv2rgb_c_48,     uint8_t,  0,  0, PUTRGB48, 48, 1)
+YUV420FUNC(yuv2rgb_c_bgr48,  uint8_t,  0,  0, PUTBGR48, 48, 1)
+YUV420FUNC(yuv2rgb_c_32,     uint32_t, 0,  0, PUTRGB,    8, 1)
 #if HAVE_BIGENDIAN
-YUV420FUNC(yuva2argb_c,      uint32_t, 1, 24, PUTRGBA,   8)
-YUV420FUNC(yuva2rgba_c,      uint32_t, 1,  0, PUTRGBA,   8)
+YUV420FUNC(yuva2argb_c,      uint32_t, 1, 24, PUTRGBA,   8, 1)
+YUV420FUNC(yuva2rgba_c,      uint32_t, 1,  0, PUTRGBA,   8, 1)
 #else
-YUV420FUNC(yuva2rgba_c,      uint32_t, 1, 24, PUTRGBA,   8)
-YUV420FUNC(yuva2argb_c,      uint32_t, 1,  0, PUTRGBA,   8)
+YUV420FUNC(yuva2rgba_c,      uint32_t, 1, 24, PUTRGBA,   8, 1)
+YUV420FUNC(yuva2argb_c,      uint32_t, 1,  0, PUTRGBA,   8, 1)
 #endif
-YUV420FUNC(yuv2rgb_c_24_rgb, uint8_t,  0,  0, PUTRGB24, 24)
-YUV420FUNC(yuv2rgb_c_24_bgr, uint8_t,  0,  0, PUTBGR24, 24)
+YUV420FUNC(yuv2rgb_c_24_rgb, uint8_t,  0,  0, PUTRGB24, 24, 1)
+YUV420FUNC(yuv2rgb_c_24_bgr, uint8_t,  0,  0, PUTBGR24, 24, 1)
 YUV420FUNC_DITHER(yuv2rgb_c_16_ordered_dither, uint16_t, LOADDITHER16,  PUTRGB16,  8)
 YUV420FUNC_DITHER(yuv2rgb_c_15_ordered_dither, uint16_t, LOADDITHER15,  PUTRGB15,  8)
 YUV420FUNC_DITHER(yuv2rgb_c_12_ordered_dither, uint16_t, LOADDITHER12,  PUTRGB12,  8)
@@ -514,18 +519,18 @@ YUV420FUNC_DITHER(yuv2rgb_c_4_ordered_dither,  uint8_t,  LOADDITHER4D,  PUTRGB4D
 YUV420FUNC_DITHER(yuv2rgb_c_4b_ordered_dither, uint8_t,  LOADDITHER4DB, PUTRGB4DB, 8)
 
 // YUV422
-YUV422FUNC(yuv422p_rgb48_c,  uint8_t,  0,  0, PUTRGB48, 48)
-YUV422FUNC(yuv422p_bgr48_c,  uint8_t,  0,  0, PUTBGR48, 48)
-YUV422FUNC(yuv422p_rgb32_c,  uint32_t, 0,  0, PUTRGB,    8)
+YUV422FUNC(yuv422p_rgb48_c,  uint8_t,  0,  0, PUTRGB48, 48, 1)
+YUV422FUNC(yuv422p_bgr48_c,  uint8_t,  0,  0, PUTBGR48, 48, 1)
+YUV422FUNC(yuv422p_rgb32_c,  uint32_t, 0,  0, PUTRGB,    8, 1)
 #if HAVE_BIGENDIAN
-YUV422FUNC(yuva422p_argb_c,  uint32_t, 1, 24, PUTRGBA,   8)
-YUV422FUNC(yuva422p_rgba_c,  uint32_t, 1,  0, PUTRGBA,   8)
+YUV422FUNC(yuva422p_argb_c,  uint32_t, 1, 24, PUTRGBA,   8, 1)
+YUV422FUNC(yuva422p_rgba_c,  uint32_t, 1,  0, PUTRGBA,   8, 1)
 #else
-YUV422FUNC(yuva422p_rgba_c,  uint32_t, 1, 24, PUTRGBA,   8)
-YUV422FUNC(yuva422p_argb_c,  uint32_t, 1,  0, PUTRGBA,   8)
+YUV422FUNC(yuva422p_rgba_c,  uint32_t, 1, 24, PUTRGBA,   8, 1)
+YUV422FUNC(yuva422p_argb_c,  uint32_t, 1,  0, PUTRGBA,   8, 1)
 #endif
-YUV422FUNC(yuv422p_rgb24_c,  uint8_t,  0,  0, PUTRGB24, 24)
-YUV422FUNC(yuv422p_bgr24_c,  uint8_t,  0,  0, PUTBGR24, 24)
+YUV422FUNC(yuv422p_rgb24_c,  uint8_t,  0,  0, PUTRGB24, 24, 1)
+YUV422FUNC(yuv422p_bgr24_c,  uint8_t,  0,  0, PUTBGR24, 24, 1)
 YUV422FUNC_DITHER(yuv422p_bgr16,     uint16_t, LOADDITHER16,  PUTRGB16,  8)
 YUV422FUNC_DITHER(yuv422p_bgr15,     uint16_t, LOADDITHER15,  PUTRGB15,  8)
 YUV422FUNC_DITHER(yuv422p_bgr12,     uint16_t, LOADDITHER12,  PUTRGB12,  8)
-- 
2.30.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* [FFmpeg-devel] [PATCH 2/4] swscale/yuv2rgb: add yuv42{0, 2}p -> gbrp unscaled colorspace converters
  2024-07-23 12:46 [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
@ 2024-07-23 12:46 ` Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 3/4] swscale/x86/yuv2rgb: add ssse3 " Ramiro Polla
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Ramiro Polla @ 2024-07-23 12:46 UTC
  To: ffmpeg-devel

---
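As a reading aid for the PUTGBRP macro below, here is a minimal sketch of
the same store pattern written as a function. It assumes 8-bit lookup
tables indexed directly by luma; put_gbrp_pair is a hypothetical name, and
the chroma-dependent table selection done by LOADCHROMA is left out. The
plane order G, B, R (planes 0, 1, 2) is taken from the macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PUTGBRP (not part of the patch): convert the
 * luma pair at index i of row l and store each component in its own
 * plane. Plane 0 holds G, plane 1 holds B, plane 2 holds R. */
static void put_gbrp_pair(uint8_t *dst_p[3][2], int l, int i,
                          const uint8_t *ysrc, const uint8_t *r,
                          const uint8_t *g, const uint8_t *b)
{
    assert(l == 0 || l == 1);
    for (int k = 0; k < 2; k++) {
        int Y = ysrc[2 * i + k];
        dst_p[0][l][2 * i + k] = g[Y];
        dst_p[1][l][2 * i + k] = b[Y];
        dst_p[2][l][2 * i + k] = r[Y];
    }
}
```

Each plane receives one byte per pixel, so a block of 8 pixels advances
every plane pointer by 8, which matches the dst_delta of 8 passed to
YUV420FUNC and YUV422FUNC.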
 libswscale/yuv2rgb.c        | 16 ++++++++++
 tests/checkasm/sw_yuv2rgb.c | 60 +++++++++++++++++++++++++++----------
 2 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/libswscale/yuv2rgb.c b/libswscale/yuv2rgb.c
index c283d6d1bd..3a41c4eba6 100644
--- a/libswscale/yuv2rgb.c
+++ b/libswscale/yuv2rgb.c
@@ -124,6 +124,16 @@ const int *sws_getCoefficients(int colorspace)
     dst_p[0][l][12 * i +  8] = dst_p[0][l][12 * i +  9] = g[Y]; \
     dst_p[0][l][12 * i + 10] = dst_p[0][l][12 * i + 11] = r[Y];
 
+#define PUTGBRP(l, src, asrc, i, abase)             \
+    Y                      = src[2 * i];            \
+    dst_p[0][l][2 * i + 0] = g[Y];                  \
+    dst_p[1][l][2 * i + 0] = b[Y];                  \
+    dst_p[2][l][2 * i + 0] = r[Y];                  \
+    Y                      = src[2 * i + 1];        \
+    dst_p[0][l][2 * i + 1] = g[Y];                  \
+    dst_p[1][l][2 * i + 1] = b[Y];                  \
+    dst_p[2][l][2 * i + 1] = r[Y];
+
 #define YUV2RGBFUNC(func_name, dst_type, alpha, yuv422, nb_dst_planes)      \
     static int func_name(SwsContext *c, const uint8_t *src[],               \
                          int srcStride[], int srcSliceY, int srcSliceH,     \
@@ -511,6 +521,7 @@ YUV420FUNC(yuva2argb_c,      uint32_t, 1,  0, PUTRGBA,   8, 1)
 #endif
 YUV420FUNC(yuv2rgb_c_24_rgb, uint8_t,  0,  0, PUTRGB24, 24, 1)
 YUV420FUNC(yuv2rgb_c_24_bgr, uint8_t,  0,  0, PUTBGR24, 24, 1)
+YUV420FUNC(yuv420p_gbrp_c,   uint8_t,  0,  0, PUTGBRP,   8, 3)
 YUV420FUNC_DITHER(yuv2rgb_c_16_ordered_dither, uint16_t, LOADDITHER16,  PUTRGB16,  8)
 YUV420FUNC_DITHER(yuv2rgb_c_15_ordered_dither, uint16_t, LOADDITHER15,  PUTRGB15,  8)
 YUV420FUNC_DITHER(yuv2rgb_c_12_ordered_dither, uint16_t, LOADDITHER12,  PUTRGB12,  8)
@@ -531,6 +542,7 @@ YUV422FUNC(yuva422p_argb_c,  uint32_t, 1,  0, PUTRGBA,   8, 1)
 #endif
 YUV422FUNC(yuv422p_rgb24_c,  uint8_t,  0,  0, PUTRGB24, 24, 1)
 YUV422FUNC(yuv422p_bgr24_c,  uint8_t,  0,  0, PUTBGR24, 24, 1)
+YUV422FUNC(yuv422p_gbrp_c,   uint8_t,  0,  0, PUTGBRP,   8, 3)
 YUV422FUNC_DITHER(yuv422p_bgr16,     uint16_t, LOADDITHER16,  PUTRGB16,  8)
 YUV422FUNC_DITHER(yuv422p_bgr15,     uint16_t, LOADDITHER15,  PUTRGB15,  8)
 YUV422FUNC_DITHER(yuv422p_bgr12,     uint16_t, LOADDITHER12,  PUTRGB12,  8)
@@ -596,6 +608,8 @@ SwsFunc ff_yuv2rgb_get_func_ptr(SwsContext *c)
             return yuv422p_bgr4_byte;
         case AV_PIX_FMT_MONOBLACK:
             return yuv2rgb_c_1_ordered_dither;
+        case AV_PIX_FMT_GBRP:
+            return yuv422p_gbrp_c;
         }
     } else {
         switch (c->dstFormat) {
@@ -636,6 +650,8 @@ SwsFunc ff_yuv2rgb_get_func_ptr(SwsContext *c)
             return yuv2rgb_c_4b_ordered_dither;
         case AV_PIX_FMT_MONOBLACK:
             return yuv2rgb_c_1_ordered_dither;
+        case AV_PIX_FMT_GBRP:
+            return yuv420p_gbrp_c;
         }
     }
     return NULL;
diff --git a/tests/checkasm/sw_yuv2rgb.c b/tests/checkasm/sw_yuv2rgb.c
index 02ed9a74d5..5125f83968 100644
--- a/tests/checkasm/sw_yuv2rgb.c
+++ b/tests/checkasm/sw_yuv2rgb.c
@@ -58,6 +58,7 @@ static const int dst_fmts[] = {
 //     AV_PIX_FMT_RGB4_BYTE,
 //     AV_PIX_FMT_BGR4_BYTE,
 //     AV_PIX_FMT_MONOBLACK,
+    AV_PIX_FMT_GBRP,
 };
 
 static int cmp_off_by_n(const uint8_t *ref, const uint8_t *test, size_t n, int accuracy)
@@ -116,13 +117,25 @@ static void check_yuv2rgb(int src_pix_fmt)
     LOCAL_ALIGNED_8(uint8_t, src_a, [MAX_LINE_SIZE * 2]);
     const uint8_t *src[4] = { src_y, src_u, src_v, src_a };
 
-    LOCAL_ALIGNED_8(uint8_t, dst0_, [2 * MAX_LINE_SIZE * 6]);
-    uint8_t *dst0[4] = { dst0_ };
-    uint8_t *lines0[2] = { dst0_, dst0_ + MAX_LINE_SIZE * 6 };
-
-    LOCAL_ALIGNED_8(uint8_t, dst1_, [2 * MAX_LINE_SIZE * 6]);
-    uint8_t *dst1[4] = { dst1_ };
-    uint8_t *lines1[2] = { dst1_, dst1_ + MAX_LINE_SIZE * 6 };
+    LOCAL_ALIGNED_8(uint8_t, dst0_0, [2 * MAX_LINE_SIZE * 6]);
+    LOCAL_ALIGNED_8(uint8_t, dst0_1, [2 * MAX_LINE_SIZE]);
+    LOCAL_ALIGNED_8(uint8_t, dst0_2, [2 * MAX_LINE_SIZE]);
+    uint8_t *dst0[4] = { dst0_0, dst0_1, dst0_2 };
+    uint8_t *lines0[4][2] = {
+        { dst0_0, dst0_0 + MAX_LINE_SIZE * 6 },
+        { dst0_1, dst0_1 + MAX_LINE_SIZE },
+        { dst0_2, dst0_2 + MAX_LINE_SIZE }
+    };
+
+    LOCAL_ALIGNED_8(uint8_t, dst1_0, [2 * MAX_LINE_SIZE * 6]);
+    LOCAL_ALIGNED_8(uint8_t, dst1_1, [2 * MAX_LINE_SIZE]);
+    LOCAL_ALIGNED_8(uint8_t, dst1_2, [2 * MAX_LINE_SIZE]);
+    uint8_t *dst1[4] = { dst1_0, dst1_1, dst1_2 };
+    uint8_t *lines1[4][2] = {
+        { dst1_0, dst1_0 + MAX_LINE_SIZE * 6 },
+        { dst1_1, dst1_1 + MAX_LINE_SIZE },
+        { dst1_2, dst1_2 + MAX_LINE_SIZE }
+    };
 
     randomize_buffers(src_y, MAX_LINE_SIZE * 2);
     randomize_buffers(src_u, MAX_LINE_SIZE);
@@ -145,7 +158,11 @@ static void check_yuv2rgb(int src_pix_fmt)
                 width >> src_desc->log2_chroma_w,
                 width,
             };
-            int dstStride[4] = { MAX_LINE_SIZE * 6 };
+            int dstStride[4] = {
+                MAX_LINE_SIZE * 6,
+                MAX_LINE_SIZE,
+                MAX_LINE_SIZE,
+            };
 
             // override log level to prevent spamming of the message
             // "No accelerated colorspace conversion found from %s to %s"
@@ -159,8 +176,14 @@ static void check_yuv2rgb(int src_pix_fmt)
                 fail();
 
             if (check_func(ctx->convert_unscaled, "%s_%s_%d", src_desc->name, dst_desc->name, width)) {
-                memset(dst0_, 0xFF, 2 * MAX_LINE_SIZE * 6);
-                memset(dst1_, 0xFF, 2 * MAX_LINE_SIZE * 6);
+                memset(dst0_0, 0xFF, 2 * MAX_LINE_SIZE * 6);
+                memset(dst1_0, 0xFF, 2 * MAX_LINE_SIZE * 6);
+                if (dst_pix_fmt == AV_PIX_FMT_GBRP) {
+                    memset(dst0_1, 0xFF, MAX_LINE_SIZE);
+                    memset(dst0_2, 0xFF, MAX_LINE_SIZE);
+                    memset(dst1_1, 0xFF, MAX_LINE_SIZE);
+                    memset(dst1_2, 0xFF, MAX_LINE_SIZE);
+                }
 
                 call_ref(ctx, src, srcStride, srcSliceY,
                          srcSliceH, dst0, dstStride);
@@ -173,19 +196,24 @@ static void check_yuv2rgb(int src_pix_fmt)
                     dst_pix_fmt == AV_PIX_FMT_BGRA  ||
                     dst_pix_fmt == AV_PIX_FMT_RGB24 ||
                     dst_pix_fmt == AV_PIX_FMT_BGR24) {
-                    if (cmp_off_by_n(lines0[0], lines1[0], width * sample_size, 3) ||
-                        cmp_off_by_n(lines0[1], lines1[1], width * sample_size, 3))
+                    if (cmp_off_by_n(lines0[0][0], lines1[0][0], width * sample_size, 3) ||
+                        cmp_off_by_n(lines0[0][1], lines1[0][1], width * sample_size, 3))
                         fail();
                 } else if (dst_pix_fmt == AV_PIX_FMT_RGB565 ||
                            dst_pix_fmt == AV_PIX_FMT_BGR565) {
-                    if (cmp_565_by_n(lines0[0], lines1[0], width, 2) ||
-                        cmp_565_by_n(lines0[1], lines1[1], width, 2))
+                    if (cmp_565_by_n(lines0[0][0], lines1[0][0], width, 2) ||
+                        cmp_565_by_n(lines0[0][1], lines1[0][1], width, 2))
                         fail();
                 } else if (dst_pix_fmt == AV_PIX_FMT_RGB555 ||
                            dst_pix_fmt == AV_PIX_FMT_BGR555) {
-                    if (cmp_555_by_n(lines0[0], lines1[0], width, 2) ||
-                        cmp_555_by_n(lines0[1], lines1[1], width, 2))
+                    if (cmp_555_by_n(lines0[0][0], lines1[0][0], width, 2) ||
+                        cmp_555_by_n(lines0[0][1], lines1[0][1], width, 2))
                         fail();
+                } else if (dst_pix_fmt == AV_PIX_FMT_GBRP) {
+                    for (int p = 0; p < 3; p++)
+                        for (int l = 0; l < 2; l++)
+                            if (cmp_off_by_n(lines0[p][l], lines1[p][l], width, 3))
+                                fail();
                 } else {
                     fail();
                 }
-- 
2.30.2


* [FFmpeg-devel] [PATCH 3/4] swscale/x86/yuv2rgb: add ssse3 yuv42{0, 2}p -> gbrp unscaled colorspace converters
  2024-07-23 12:46 [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 2/4] swscale/yuv2rgb: add yuv42{0, 2}p -> gbrp unscaled colorspace converters Ramiro Polla
@ 2024-07-23 12:46 ` Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 4/4] swscale/aarch64/yuv2rgb: add neon " Ramiro Polla
  2024-07-30 13:05 ` [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
  3 siblings, 0 replies; 7+ messages in thread
From: Ramiro Polla @ 2024-07-23 12:46 UTC (permalink / raw)
  To: ffmpeg-devel

Note: this implementation is limited to x86_64 due to general purpose
      register pressure.
---
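Note (illustrative only, not part of the patch): a scalar C model of what
the gbrp_shuf constant in the diff below does. After packuswb each register
holds the even-indexed samples in its low 8 bytes and the odd-indexed ones
in its high 8 bytes; the shuffle interleaves them back into pixel order
before the per-plane stores. The function name shuffle16 is hypothetical
and only models pshufb on one 128-bit register.

```c
#include <stdint.h>

/* Same byte indices as the gbrp_shuf constant added by the patch. */
const uint8_t gbrp_shuf[16] = {
    0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15
};

/* Scalar model of a 128-bit pshufb: each output byte is either zero
 * (control byte has bit 7 set) or the input byte selected by the low
 * four bits of the control byte. */
void shuffle16(uint8_t out[16], const uint8_t in[16], const uint8_t ctl[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (ctl[i] & 0x80) ? 0 : in[ctl[i] & 0x0F];
}
```

Feeding it a register laid out as "0 2 4 ... 14 1 3 ... 15" with gbrp_shuf
as the control yields the bytes in ascending order, which is the layout a
planar destination needs.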
 libswscale/x86/yuv2rgb.c     | 39 ++++++++++++++++++++++++++++++++++++
 libswscale/x86/yuv_2_rgb.asm | 24 +++++++++++++++++++++-
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/libswscale/x86/yuv2rgb.c b/libswscale/x86/yuv2rgb.c
index 68e903c6ad..2a4505fa90 100644
--- a/libswscale/x86/yuv2rgb.c
+++ b/libswscale/x86/yuv2rgb.c
@@ -79,6 +79,12 @@ extern void ff_yuva_420_rgb32_ssse3(x86_reg index, uint8_t *image, const uint8_t
 extern void ff_yuva_420_bgr32_ssse3(x86_reg index, uint8_t *image, const uint8_t *pu_index,
                                     const uint8_t *pv_index, const uint64_t *pointer_c_dither,
                                     const uint8_t *py_2index, const uint8_t *pa_2index);
+#if ARCH_X86_64
+extern void ff_yuv_420_gbrp24_ssse3(x86_reg index, uint8_t *image, uint8_t *dst_b, uint8_t *dst_r,
+                                    const uint8_t *pu_index, const uint8_t *pv_index,
+                                    const uint64_t *pointer_c_dither,
+                                    const uint8_t *py_2index);
+#endif
 
 static inline int yuv420_rgb15_ssse3(SwsContext *c, const uint8_t *src[],
                                      int srcStride[],
@@ -201,6 +207,35 @@ static inline int yuv420_bgr24_ssse3(SwsContext *c, const uint8_t *src[],
     return srcSliceH;
 }
 
+#if ARCH_X86_64
+static inline int yuv420_gbrp_ssse3(SwsContext *c, const uint8_t *src[],
+                                    int srcStride[],
+                                    int srcSliceY, int srcSliceH,
+                                    uint8_t *dst[], int dstStride[])
+{
+    int y, h_size, vshift;
+
+    h_size = (c->dstW + 7) & ~7;
+    if (h_size * 3 > FFABS(dstStride[0]))
+        h_size -= 8;
+
+    vshift = c->srcFormat != AV_PIX_FMT_YUV422P;
+
+    for (y = 0; y < srcSliceH; y++) {
+        uint8_t *dst_g    = dst[0] + (y + srcSliceY) * dstStride[0];
+        uint8_t *dst_b    = dst[1] + (y + srcSliceY) * dstStride[1];
+        uint8_t *dst_r    = dst[2] + (y + srcSliceY) * dstStride[2];
+        const uint8_t *py = src[0] +               y * srcStride[0];
+        const uint8_t *pu = src[1] +   (y >> vshift) * srcStride[1];
+        const uint8_t *pv = src[2] +   (y >> vshift) * srcStride[2];
+        x86_reg index = -h_size / 2;
+
+        ff_yuv_420_gbrp24_ssse3(index, dst_g, dst_b, dst_r, pu - index, pv - index, &(c->redDither), py - 2 * index);
+    }
+    return srcSliceH;
+}
+#endif
+
 #endif /* HAVE_X86ASM */
 
 av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
@@ -234,6 +269,10 @@ av_cold SwsFunc ff_yuv2rgb_init_x86(SwsContext *c)
             return yuv420_rgb16_ssse3;
         case AV_PIX_FMT_RGB555:
             return yuv420_rgb15_ssse3;
+#if ARCH_X86_64
+        case AV_PIX_FMT_GBRP:
+            return yuv420_gbrp_ssse3;
+#endif
         }
     }
 
diff --git a/libswscale/x86/yuv_2_rgb.asm b/libswscale/x86/yuv_2_rgb.asm
index b67ab162d2..eeb1d25942 100644
--- a/libswscale/x86/yuv_2_rgb.asm
+++ b/libswscale/x86/yuv_2_rgb.asm
@@ -32,6 +32,7 @@ mask_dw25  : db  0,  0,  0,  0, -1, -1,  0,  0,  0,  0, -1, -1,  0,  0,  0,  0
 rgb24_shuf1: db  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15,  4,  5, 10, 11
 rgb24_shuf2: db 10, 11,  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15,  4,  5
 rgb24_shuf3: db  4,  5, 10, 11,  0,  1,  6,  7, 12, 13,  2,  3,  8,  9, 14, 15
+gbrp_shuf  : db  0,  8,  1,  9,  2, 10,  3, 11,  4, 12,  5, 13,  6, 14,  7, 15
 pw_00ff: times 8 dw 255
 pb_f8:   times 16 db 248
 pb_e0:   times 16 db 224
@@ -60,8 +61,13 @@ SECTION .text
     %define GPR_num 6
     %endif
 %else
+    %ifidn %2, gbrp
+    %define parameters index, image, dst_b, dst_r, pu_index, pv_index, pointer_c_dither, py_2index
+    %define GPR_num 8
+    %else
     %define parameters index, image, pu_index, pv_index, pointer_c_dither, py_2index
     %define GPR_num 6
+    %endif
 %endif
 
 %define m_green m2
@@ -172,10 +178,22 @@ cglobal %1_420_%2%3, GPR_num, GPR_num, reg_num, parameters
     paddsw m2, m6 ; G0 G2 G4 G6 ...
 
 %if %3 == 24 ; PACK RGB24
-%define depth 3
     packuswb m0, m3 ; B0 B2 B4 B6 ... B1 B3 B5 B7 ...
     packuswb m1, m5 ; R0 R2 R4 R6 ... R1 R3 R5 R7 ...
     packuswb m2, m7 ; G0 G2 G4 G6 ... G1 G3 G5 G7 ...
+%ifidn %2, gbrp ; PLANAR GBRP
+%define depth 1
+    mova   m4, [gbrp_shuf]
+    pshufb m0, m4
+    pshufb m1, m4
+    pshufb m2, m4
+    movu [imageq], m2
+    movu [dst_bq], m0
+    movu [dst_rq], m1
+    add dst_bq, 8 * depth * time_num
+    add dst_rq, 8 * depth * time_num
+%else
+%define depth 3
     mova m3, m_red
     mova m6, m_blue
     psrldq m_red, 8
@@ -206,6 +224,7 @@ cglobal %1_420_%2%3, GPR_num, GPR_num, reg_num, parameters
     movu [imageq], m0
     movu [imageq + 16], m1
     movu [imageq + 32], m2
+%endif ; PLANAR GBRP
 %else ; PACK RGB15/16/32
     packuswb m0, m1
     packuswb m3, m5
@@ -292,3 +311,6 @@ yuv2rgb_fn yuva, rgb, 32
 yuv2rgb_fn yuva, bgr, 32
 yuv2rgb_fn yuv,  rgb, 15
 yuv2rgb_fn yuv,  rgb, 16
+%if ARCH_X86_64
+yuv2rgb_fn yuv,  gbrp, 24
+%endif
-- 
2.30.2


* [FFmpeg-devel] [PATCH 4/4] swscale/aarch64/yuv2rgb: add neon yuv42{0, 2}p -> gbrp unscaled colorspace converters
  2024-07-23 12:46 [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 2/4] swscale/yuv2rgb: add yuv42{0, 2}p -> gbrp unscaled colorspace converters Ramiro Polla
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 3/4] swscale/x86/yuv2rgb: add ssse3 " Ramiro Polla
@ 2024-07-23 12:46 ` Ramiro Polla
  2024-07-30 13:05 ` [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
  3 siblings, 0 replies; 7+ messages in thread
From: Ramiro Polla @ 2024-07-23 12:46 UTC (permalink / raw)
  To: ffmpeg-devel

---
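Note (illustrative only, not part of the patch): a C sketch of the pointer
bookkeeping the NEON code below performs for each destination plane. The
assembly precomputes padding = linesize - bytes written per row and adds it
to the pointer after every row; for gbrp each plane holds one byte per
pixel, so the padding is linesize - width instead of linesize - width * 4.
The function name fill_plane_rows is hypothetical, and a constant fill
stands in for the converted samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write h rows of w bytes into one plane, stepping
 * over the unused tail of each row the same way the NEON loop does. */
void fill_plane_rows(uint8_t *dst, int linesize, int w, int h, uint8_t v)
{
    int padding = linesize - w;     /* bytes to skip after each row */

    assert(w >= 0 && h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            *dst++ = v;             /* post-incrementing store */
        dst += padding;             /* dst += padding, as in the asm */
    }
}
```

With three planes, the same step is applied to three pointers with three
independent padding values, which is why the patch loads dst1/linesize1 and
dst2/linesize2 from the stack in addition to dst/linesize.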
 libswscale/aarch64/swscale_unscaled.c | 58 +++++++++++++++++++++
 libswscale/aarch64/yuv2rgb_neon.S     | 73 ++++++++++++++++++++++-----
 2 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/libswscale/aarch64/swscale_unscaled.c b/libswscale/aarch64/swscale_unscaled.c
index b3093bbc9d..5c4f6fee34 100644
--- a/libswscale/aarch64/swscale_unscaled.c
+++ b/libswscale/aarch64/swscale_unscaled.c
@@ -52,11 +52,41 @@ static int ifmt##_to_##ofmt##_neon_wrapper(SwsContext *c, const uint8_t *src[],
                                         c->yuv2rgb_y_coeff);                                \
 }                                                                                           \
 
+#define DECLARE_FF_YUVX_TO_GBRP_FUNCS(ifmt, ofmt)                                           \
+int ff_##ifmt##_to_##ofmt##_neon(int w, int h,                                              \
+                                 uint8_t *dst, int linesize,                                \
+                                 const uint8_t *srcY, int linesizeY,                        \
+                                 const uint8_t *srcU, int linesizeU,                        \
+                                 const uint8_t *srcV, int linesizeV,                        \
+                                 const int16_t *table,                                      \
+                                 int y_offset,                                              \
+                                 int y_coeff,                                               \
+                                 uint8_t *dst1, int linesize1,                              \
+                                 uint8_t *dst2, int linesize2);                             \
+                                                                                            \
+static int ifmt##_to_##ofmt##_neon_wrapper(SwsContext *c, const uint8_t *src[],             \
+                                           int srcStride[], int srcSliceY, int srcSliceH,   \
+                                           uint8_t *dst[], int dstStride[]) {               \
+    const int16_t yuv2rgb_table[] = { YUV_TO_RGB_TABLE };                                   \
+                                                                                            \
+    return ff_##ifmt##_to_##ofmt##_neon(c->srcW, srcSliceH,                                 \
+                                        dst[0] + srcSliceY * dstStride[0], dstStride[0],    \
+                                        src[0], srcStride[0],                               \
+                                        src[1], srcStride[1],                               \
+                                        src[2], srcStride[2],                               \
+                                        yuv2rgb_table,                                      \
+                                        c->yuv2rgb_y_offset >> 6,                           \
+                                        c->yuv2rgb_y_coeff,                                 \
+                                        dst[1] + srcSliceY * dstStride[1], dstStride[1],    \
+                                        dst[2] + srcSliceY * dstStride[2], dstStride[2]);   \
+}                                                                                           \
+
 #define DECLARE_FF_YUVX_TO_ALL_RGBX_FUNCS(yuvx)                                             \
 DECLARE_FF_YUVX_TO_RGBX_FUNCS(yuvx, argb)                                                   \
 DECLARE_FF_YUVX_TO_RGBX_FUNCS(yuvx, rgba)                                                   \
 DECLARE_FF_YUVX_TO_RGBX_FUNCS(yuvx, abgr)                                                   \
 DECLARE_FF_YUVX_TO_RGBX_FUNCS(yuvx, bgra)                                                   \
+DECLARE_FF_YUVX_TO_GBRP_FUNCS(yuvx, gbrp)                                                   \
 
 DECLARE_FF_YUVX_TO_ALL_RGBX_FUNCS(yuv420p)
 DECLARE_FF_YUVX_TO_ALL_RGBX_FUNCS(yuv422p)
@@ -83,11 +113,38 @@ static int ifmt##_to_##ofmt##_neon_wrapper(SwsContext *c, const uint8_t *src[],
                                         c->yuv2rgb_y_coeff);                                \
 }                                                                                           \
 
+#define DECLARE_FF_NVX_TO_GBRP_FUNCS(ifmt, ofmt)                                            \
+int ff_##ifmt##_to_##ofmt##_neon(int w, int h,                                              \
+                                 uint8_t *dst, int linesize,                                \
+                                 const uint8_t *srcY, int linesizeY,                        \
+                                 const uint8_t *srcC, int linesizeC,                        \
+                                 const int16_t *table,                                      \
+                                 int y_offset,                                              \
+                                 int y_coeff,                                               \
+                                 uint8_t *dst1, int linesize1,                              \
+                                 uint8_t *dst2, int linesize2);                             \
+                                                                                            \
+static int ifmt##_to_##ofmt##_neon_wrapper(SwsContext *c, const uint8_t *src[],             \
+                                           int srcStride[], int srcSliceY, int srcSliceH,   \
+                                           uint8_t *dst[], int dstStride[]) {               \
+    const int16_t yuv2rgb_table[] = { YUV_TO_RGB_TABLE };                                   \
+                                                                                            \
+    return ff_##ifmt##_to_##ofmt##_neon(c->srcW, srcSliceH,                                 \
+                                        dst[0] + srcSliceY * dstStride[0], dstStride[0],    \
+                                        src[0], srcStride[0], src[1], srcStride[1],         \
+                                        yuv2rgb_table,                                      \
+                                        c->yuv2rgb_y_offset >> 6,                           \
+                                        c->yuv2rgb_y_coeff,                                 \
+                                        dst[1] + srcSliceY * dstStride[1], dstStride[1],    \
+                                        dst[2] + srcSliceY * dstStride[2], dstStride[2]);   \
+}                                                                                           \
+
 #define DECLARE_FF_NVX_TO_ALL_RGBX_FUNCS(nvx)                                               \
 DECLARE_FF_NVX_TO_RGBX_FUNCS(nvx, argb)                                                     \
 DECLARE_FF_NVX_TO_RGBX_FUNCS(nvx, rgba)                                                     \
 DECLARE_FF_NVX_TO_RGBX_FUNCS(nvx, abgr)                                                     \
 DECLARE_FF_NVX_TO_RGBX_FUNCS(nvx, bgra)                                                     \
+DECLARE_FF_NVX_TO_GBRP_FUNCS(nvx, gbrp)                                                     \
 
 DECLARE_FF_NVX_TO_ALL_RGBX_FUNCS(nv12)
 DECLARE_FF_NVX_TO_ALL_RGBX_FUNCS(nv21)
@@ -110,6 +167,7 @@ DECLARE_FF_NVX_TO_ALL_RGBX_FUNCS(nv21)
     SET_FF_NVX_TO_RGBX_FUNC(nvx, NVX, rgba, RGBA, accurate_rnd);                            \
     SET_FF_NVX_TO_RGBX_FUNC(nvx, NVX, abgr, ABGR, accurate_rnd);                            \
     SET_FF_NVX_TO_RGBX_FUNC(nvx, NVX, bgra, BGRA, accurate_rnd);                            \
+    SET_FF_NVX_TO_RGBX_FUNC(nvx, NVX, gbrp, GBRP, accurate_rnd);                            \
 } while (0)
 
 static void get_unscaled_swscale_neon(SwsContext *c) {
diff --git a/libswscale/aarch64/yuv2rgb_neon.S b/libswscale/aarch64/yuv2rgb_neon.S
index 89d69e7f6c..b89eb2c781 100644
--- a/libswscale/aarch64/yuv2rgb_neon.S
+++ b/libswscale/aarch64/yuv2rgb_neon.S
@@ -30,23 +30,43 @@
 #endif
 .endm
 
-.macro load_args_nv12
+.macro load_dst1_dst2 dst1 linesize1 dst2 linesize2
+#if defined(__APPLE__)
+#define DST_OFFSET 8
+#else
+#define DST_OFFSET 0
+#endif
+        ldr             x10, [sp, #\dst1      - DST_OFFSET]
+        ldr             w12, [sp, #\linesize1 - DST_OFFSET]
+        ldr             x15, [sp, #\dst2      - DST_OFFSET]
+        ldr             w16, [sp, #\linesize2 - DST_OFFSET]
+#undef DST_OFFSET
+        sub             w12, w12, w0                                    // w12 = linesize1 - width     (padding1)
+        sub             w16, w16, w0                                    // w16 = linesize2 - width     (padding2)
+.endm
+
+.macro load_args_nv12 ofmt
         ldr             x8,  [sp]                                       // table
         load_yoff_ycoeff 8, 16                                           // y_offset, y_coeff
         ld1             {v1.1d}, [x8]
         dup             v0.8h, w10
         dup             v3.8h, w9
+.ifc \ofmt,gbrp
+        load_dst1_dst2  24, 32, 40, 48
+        sub             w3, w3, w0                                      // w3 = linesize  - width     (padding)
+.else
         sub             w3, w3, w0, lsl #2                              // w3 = linesize  - width * 4 (padding)
+.endif
         sub             w5, w5, w0                                      // w5 = linesizeY - width     (paddingY)
         sub             w7, w7, w0                                      // w7 = linesizeC - width     (paddingC)
         neg             w11, w0
 .endm
 
-.macro load_args_nv21
-    load_args_nv12
+.macro load_args_nv21 ofmt
+    load_args_nv12 \ofmt
 .endm
 
-.macro load_args_yuv420p
+.macro load_args_yuv420p ofmt
         ldr             x13, [sp]                                       // srcV
         ldr             w14, [sp, #8]                                   // linesizeV
         ldr             x8,  [sp, #16]                                  // table
@@ -54,7 +74,12 @@
         ld1             {v1.1d}, [x8]
         dup             v0.8h, w10
         dup             v3.8h, w9
+.ifc \ofmt,gbrp
+        load_dst1_dst2  40, 48, 56, 64
+        sub             w3, w3, w0                                      // w3 = linesize  - width     (padding)
+.else
         sub             w3, w3, w0, lsl #2                              // w3 = linesize  - width * 4 (padding)
+.endif
         sub             w5, w5, w0                                      // w5 = linesizeY - width     (paddingY)
         sub             w7,  w7,  w0, lsr #1                            // w7  = linesizeU - width / 2 (paddingU)
         sub             w14, w14, w0, lsr #1                            // w14 = linesizeV - width / 2 (paddingV)
@@ -62,7 +87,7 @@
         neg             w11, w11
 .endm
 
-.macro load_args_yuv422p
+.macro load_args_yuv422p ofmt
         ldr             x13, [sp]                                       // srcV
         ldr             w14, [sp, #8]                                   // linesizeV
         ldr             x8,  [sp, #16]                                  // table
@@ -70,7 +95,12 @@
         ld1             {v1.1d}, [x8]
         dup             v0.8h, w10
         dup             v3.8h, w9
+.ifc \ofmt,gbrp
+        load_dst1_dst2  40, 48, 56, 64
+        sub             w3, w3, w0                                      // w3 = linesize  - width     (padding)
+.else
         sub             w3, w3, w0, lsl #2                              // w3 = linesize  - width * 4 (padding)
+.endif
         sub             w5, w5, w0                                      // w5 = linesizeY - width     (paddingY)
         sub             w7,  w7,  w0, lsr #1                            // w7  = linesizeU - width / 2 (paddingU)
         sub             w14, w14, w0, lsr #1                            // w14 = linesizeV - width / 2 (paddingV)
@@ -100,9 +130,9 @@
 .endm
 
 .macro increment_nv12
-        ands            w15, w1, #1
-        csel            w16, w7, w11, ne                                // incC = (h & 1) ? paddincC : -width
-        add             x6,  x6, w16, sxtw                              // srcC += incC
+        ands            w17, w1, #1
+        csel            w17, w7, w11, ne                                // incC = (h & 1) ? paddincC : -width
+        add             x6,  x6, w17, sxtw                              // srcC += incC
 .endm
 
 .macro increment_nv21
@@ -110,10 +140,10 @@
 .endm
 
 .macro increment_yuv420p
-        ands            w15, w1, #1
-        csel            w16,  w7, w11, ne                               // incU = (h & 1) ? paddincU : -width/2
+        ands            w17, w1, #1
+        csel            w17,  w7, w11, ne                               // incU = (h & 1) ? paddincU : -width/2
+        add             x6,  x6,  w17, sxtw                             // srcU += incU
         csel            w17, w14, w11, ne                               // incV = (h & 1) ? paddincV : -width/2
-        add             x6,  x6,  w16, sxtw                             // srcU += incU
         add             x13, x13, w17, sxtw                             // srcV += incV
 .endm
 
@@ -122,7 +152,7 @@
         add             x13, x13, w14, sxtw                             // srcV += incV
 .endm
 
-.macro compute_rgba r1 g1 b1 a1 r2 g2 b2 a2
+.macro compute_rgb r1 g1 b1 r2 g2 b2
         add             v20.8h, v26.8h, v20.8h                          // Y1 + R1
         add             v21.8h, v27.8h, v21.8h                          // Y2 + R2
         add             v22.8h, v26.8h, v22.8h                          // Y1 + G1
@@ -135,13 +165,18 @@
         sqrshrun        \g2, v23.8h, #1                                 // clip_u8((Y2 + G1) >> 1)
         sqrshrun        \b1, v24.8h, #1                                 // clip_u8((Y1 + B1) >> 1)
         sqrshrun        \b2, v25.8h, #1                                 // clip_u8((Y2 + B1) >> 1)
+.endm
+
+.macro compute_rgba r1 g1 b1 a1 r2 g2 b2 a2
+        compute_rgb     \r1, \g1, \b1, \r2, \g2, \b2
         movi            \a1, #255
         movi            \a2, #255
 .endm
 
 .macro declare_func ifmt ofmt
 function ff_\ifmt\()_to_\ofmt\()_neon, export=1
-    load_args_\ifmt
+    load_args_\ifmt \ofmt
+
         mov             w9, w1
 1:
         mov             w8, w0                                          // w8 = width
@@ -185,11 +220,22 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1
         compute_rgba    v6.8b,v5.8b,v4.8b,v7.8b, v18.8b,v17.8b,v16.8b,v19.8b
 .endif
 
+.ifc \ofmt,gbrp
+        compute_rgb     v18.8b,v4.8b,v6.8b, v19.8b,v5.8b,v7.8b
+        st1             {  v4.8b,  v5.8b }, [x2],  #16
+        st1             {  v6.8b,  v7.8b }, [x10], #16
+        st1             { v18.8b, v19.8b }, [x15], #16
+.else
         st4             { v4.8b, v5.8b, v6.8b, v7.8b}, [x2], #32
         st4             {v16.8b,v17.8b,v18.8b,v19.8b}, [x2], #32
+.endif
         subs            w8, w8, #16                                     // width -= 16
         b.gt            2b
         add             x2, x2, w3, sxtw                                // dst  += padding
+.ifc \ofmt,gbrp
+        add             x10, x10, w12, sxtw                             // dst1 += padding1
+        add             x15, x15, w16, sxtw                             // dst2 += padding2
+.endif
         add             x4, x4, w5, sxtw                                // srcY += paddingY
     increment_\ifmt
         subs            w1, w1, #1                                      // height -= 1
@@ -204,6 +250,7 @@ endfunc
         declare_func    \ifmt, rgba
         declare_func    \ifmt, abgr
         declare_func    \ifmt, bgra
+        declare_func    \ifmt, gbrp
 .endm
 
 declare_rgb_funcs nv12
-- 
2.30.2


* Re: [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb
  2024-07-23 12:46 [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
                   ` (2 preceding siblings ...)
  2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 4/4] swscale/aarch64/yuv2rgb: add neon " Ramiro Polla
@ 2024-07-30 13:05 ` Ramiro Polla
  2024-07-31 11:14   ` Michael Niedermayer
  3 siblings, 1 reply; 7+ messages in thread
From: Ramiro Polla @ 2024-07-30 13:05 UTC (permalink / raw)
  To: ffmpeg-devel

On Tue, Jul 23, 2024 at 2:46 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
> colorspace converters.

ping on this patchset.

* Re: [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb
  2024-07-30 13:05 ` [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
@ 2024-07-31 11:14   ` Michael Niedermayer
  2024-08-06 10:54     ` Ramiro Polla
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Niedermayer @ 2024-07-31 11:14 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


On Tue, Jul 30, 2024 at 03:05:22PM +0200, Ramiro Polla wrote:
> On Tue, Jul 23, 2024 at 2:46 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
> > colorspace converters.
> 
> ping on this patchset.

Maybe you can add benchmarks for the changes that affect performance,
and also put a note in the commit messages of changes that
alter the output.

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 2
"100% positive feedback" - "All either got their money back or didnt complain"
"Best seller ever, very honest" - "Seller refunded buyer after failed scam"


* Re: [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb
  2024-07-31 11:14   ` Michael Niedermayer
@ 2024-08-06 10:54     ` Ramiro Polla
  0 siblings, 0 replies; 7+ messages in thread
From: Ramiro Polla @ 2024-08-06 10:54 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Wed, Jul 31, 2024 at 1:14 PM Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Tue, Jul 30, 2024 at 03:05:22PM +0200, Ramiro Polla wrote:
> > On Tue, Jul 23, 2024 at 2:46 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > > This will be used in the upcoming yuv42{0,2}p -> gbrp unscaled
> > > colorspace converters.
> >
> > ping on this patchset.
>
> Maybe you can add benchmarks for the changes that affect performance,
> and also put a note in the commit messages of changes that
> alter the output.

For the commit that adds the C unscaled converter, which metrics would
be important in the commit log? I would guess performance and the SSIM
difference from the current behaviour, which goes through the swscaler,
but it doesn't seem like we have a tool to do that easily.

Thread overview: 7+ messages
2024-07-23 12:46 [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 2/4] swscale/yuv2rgb: add yuv42{0, 2}p -> gbrp unscaled colorspace converters Ramiro Polla
2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 3/4] swscale/x86/yuv2rgb: add ssse3 " Ramiro Polla
2024-07-23 12:46 ` [FFmpeg-devel] [PATCH 4/4] swscale/aarch64/yuv2rgb: add neon " Ramiro Polla
2024-07-30 13:05 ` [FFmpeg-devel] [PATCH 1/4] swscale/yuv2rgb: prepare YUV2RGBFUNC macro for multi-planar rgb Ramiro Polla
2024-07-31 11:14   ` Michael Niedermayer
2024-08-06 10:54     ` Ramiro Polla
