[FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support

Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

* [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support
@ 2022-11-23 19:35 mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 1/4] swscale/input: add rgbaf32 input support mindmark
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: mindmark @ 2022-11-23 19:35 UTC
  To: ffmpeg-devel; +Cc: Mark Reid

From: Mark Reid <mindmark@gmail.com>

This patch series adds swscale input/output support for the packed RGB float formats.
A few of the filters also needed to support the larger 96/128-bit packed pixel sizes.
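
As a rough illustration of what handling a 96-bit packed pixel involves (a standalone sketch, not the code from this series): an rgbf32 pixel is three 32-bit words, so a horizontal flip copies whole pixels in reverse order while keeping the word order inside each pixel. The name hflip_row_96 and the convention that src points at the first pixel of the row are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: horizontally flip one row of w packed 96-bit pixels
 * (three 32-bit words each, e.g. rgbf32). src points at the first pixel. */
static void hflip_row_96(const uint32_t *src, uint32_t *dst, int w)
{
    assert(w >= 0);
    for (int j = 0; j < w; j++) {
        /* Output pixel j comes from the mirrored position in the source row. */
        const uint32_t *in = src + 3 * (size_t)(w - 1 - j);
        uint32_t *out = dst + 3 * (size_t)j;
        out[0] = in[0];
        out[1] = in[1];
        out[2] = in[2];
    }
}
```

A 128-bit (rgbaf32) variant would follow the same pattern with four words per pixel.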

I also plan to eventually add lossless unscaled conversions between the planar and packed formats.

changes since v4
* added comment about refactoring input functions
changes since v3
* removed half uv path implementation
changes since v2
* add bias to rgbaf32 output to improve the non-overflowing range
changes since v1
* output correct alpha if src doesn't have alpha
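
The v1 item above (correct alpha when the source has none) amounts to writing a fully opaque 1.0f. A minimal standalone sketch of that idea follows. The name store_rgbaf32 and the use of a NULL alpha pointer to mean "source has no alpha" are assumptions for illustration, not names from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one rgbaf32 pixel from 16-bit components.
 * alpha == NULL models a source without an alpha channel. */
static void store_rgbaf32(float dst[4], uint16_t r, uint16_t g, uint16_t b,
                          const uint16_t *alpha)
{
    /* Same 1/65535 scale factor that the output patch uses. */
    const float mult = 1.0f / 65535.0f;

    assert(dst != NULL);
    dst[0] = mult * (float)r;
    dst[1] = mult * (float)g;
    dst[2] = mult * (float)b;
    /* Without source alpha, write an opaque value instead of leaving it unset. */
    dst[3] = alpha ? mult * (float)*alpha : 1.0f;
}
```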


Mark Reid (4):
  swscale/input: add rgbaf32 input support
  avfilter/vf_hflip: add support for packed rgb float formats
  avfilter/vf_transpose: add support for packed rgb float formats
  swscale/output: add rgbaf32 output support

 libavfilter/vf_hflip_init.h              |  25 +++++
 libavfilter/vf_transpose.c               |  44 ++++++++
 libswscale/input.c                       | 122 +++++++++++++++++++++++
 libswscale/output.c                      |  92 +++++++++++++++++
 libswscale/swscale_unscaled.c            |   4 +-
 libswscale/tests/floatimg_cmp.c          |   4 +-
 libswscale/utils.c                       |  14 ++-
 libswscale/yuv2rgb.c                     |   2 +
 tests/ref/fate/filter-pixdesc-rgbaf32be  |   1 +
 tests/ref/fate/filter-pixdesc-rgbaf32le  |   1 +
 tests/ref/fate/filter-pixdesc-rgbf32be   |   1 +
 tests/ref/fate/filter-pixdesc-rgbf32le   |   1 +
 tests/ref/fate/filter-pixfmts-copy       |   4 +
 tests/ref/fate/filter-pixfmts-crop       |   4 +
 tests/ref/fate/filter-pixfmts-field      |   4 +
 tests/ref/fate/filter-pixfmts-fieldorder |   4 +
 tests/ref/fate/filter-pixfmts-hflip      |   4 +
 tests/ref/fate/filter-pixfmts-il         |   4 +
 tests/ref/fate/filter-pixfmts-null       |   4 +
 tests/ref/fate/filter-pixfmts-scale      |   4 +
 tests/ref/fate/filter-pixfmts-transpose  |   4 +
 tests/ref/fate/filter-pixfmts-vflip      |   4 +
 tests/ref/fate/sws-floatimg-cmp          |  16 +++
 23 files changed, 363 insertions(+), 4 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32le
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32le

--
2.31.1.windows.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


* [FFmpeg-devel] [PATCH v5 1/4] swscale/input: add rgbaf32 input support
  2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
@ 2022-11-23 19:35 ` mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 2/4] avfilter/vf_hflip: add support for packed rgb float formats mindmark
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: mindmark @ 2022-11-23 19:35 UTC
  To: ffmpeg-devel; +Cc: Mark Reid

From: Mark Reid <mindmark@gmail.com>

The input functions currently mirror the planar f32 functions.
They can be refactored to remove the repeated lrintf/av_clipf calls;
this will be addressed in a future patch.
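
For readers unfamiliar with the pattern, the per-component step is: read a 32-bit float in the stream's byte order, scale it to 0..65535, clip, and round. A standalone sketch under stated assumptions: read_f32 and f32_to_u16 are hypothetical names, the byte-wise read stands in for AV_RB32/AV_RL32 plus av_int2float, and adding 0.5f before truncating stands in for lrintf (ties round up rather than to even) so that the example needs no math library. Mapping NaN to 0 is a choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one 32-bit float stored with the given byte order. */
static float read_f32(const uint8_t *p, int is_be)
{
    uint32_t bits = is_be
        ? ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | p[3]
        : ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) | ((uint32_t)p[1] << 8) | p[0];
    float f;

    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));   /* bit cast without aliasing problems */
    return f;
}

/* Hypothetical helper: scale to 0..65535, clip, then round to nearest. */
static uint16_t f32_to_u16(float v)
{
    float scaled = 65535.0f * v;

    if (!(scaled > 0.0f))           /* negative values, zero, and NaN */
        return 0;
    if (scaled > 65535.0f)          /* includes +infinity */
        scaled = 65535.0f;
    return (uint16_t)(scaled + 0.5f);
}
```

Doing the clip and round once per component, as here, is the shape the planned refactoring points toward.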
---
 libswscale/input.c | 122 +++++++++++++++++++++++++++++++++++++++++++++
 libswscale/utils.c |   6 +++
 2 files changed, 128 insertions(+)

diff --git a/libswscale/input.c b/libswscale/input.c
index d5676062a2..9c9eb31cde 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -1284,6 +1284,98 @@ static void rgbaf16##endian_name##ToA_c(uint8_t *_dst, const uint8_t *_src, cons
 rgbaf16_funcs_endian(le, 0)
 rgbaf16_funcs_endian(be, 1)
 
+#define rdpx(src) (is_be ? av_int2float(AV_RB32(&src)): av_int2float(AV_RL32(&src)))
+
+static av_always_inline void rgbaf32ToUV_endian(uint16_t *dstU, uint16_t *dstV, int is_be,
+                                                const float *src, int width,
+                                                int32_t *rgb2yuv, int comp)
+{
+    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
+    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int i;
+    /*TODO: refactor these f32 conversions to only have one lrintf and av_clipf call*/
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+2]), 0.0f, 65535.0f));
+
+        dstU[i] = (ru*r + gu*g + bu*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+        dstV[i] = (rv*r + gv*g + bv*b + (0x10001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf32ToY_endian(uint16_t *dst, const float *src, int is_be,
+                                               int width, int32_t *rgb2yuv, int comp)
+{
+    int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
+    int i;
+    /*TODO: refactor these f32 conversions to only have one lrintf and av_clipf call*/
+    for (i = 0; i < width; i++) {
+        int r = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+0]), 0.0f, 65535.0f));
+        int g = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+1]), 0.0f, 65535.0f));
+        int b = lrintf(av_clipf(65535.0f * rdpx(src[i*comp+2]), 0.0f, 65535.0f));
+
+        dst[i] = (ry*r + gy*g + by*b + (0x2001<<(RGB2YUV_SHIFT-1))) >> RGB2YUV_SHIFT;
+    }
+}
+
+static av_always_inline void rgbaf32ToA_endian(uint16_t *dst, const float *src, int is_be,
+                                               int width, void *opq)
+{
+    int i;
+    for (i=0; i<width; i++) {
+        dst[i] = lrintf(av_clipf(65535.0f * rdpx(src[i*4+3]), 0.0f, 65535.0f));
+    }
+}
+
+#undef rdpx
+
+#define rgbaf32_funcs_endian(endian_name, endian)                                                         \
+static void rgbf32##endian_name##ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused,            \
+                                         const uint8_t *src1, const uint8_t *src2,                        \
+                                         int width, uint32_t *rgb2yuv, void *opq)                         \
+{                                                                                                         \
+    const float *src = (const float*)src1;                                                                \
+    uint16_t *dstU = (uint16_t*)_dstU;                                                                    \
+    uint16_t *dstV = (uint16_t*)_dstV;                                                                    \
+    av_assert1(src1==src2);                                                                               \
+    rgbaf32ToUV_endian(dstU, dstV, endian, src, width, rgb2yuv, 3);                                       \
+}                                                                                                         \
+static void rgbf32##endian_name##ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0,        \
+                                        const uint8_t *unused1, int width, uint32_t *rgb2yuv, void *opq)  \
+{                                                                                                         \
+    const float *src = (const float*)_src;                                                                \
+    uint16_t *dst = (uint16_t*)_dst;                                                                      \
+    rgbaf32ToY_endian(dst, src, endian, width, rgb2yuv, 3);                                               \
+}                                                                                                         \
+static void rgbaf32##endian_name##ToUV_c(uint8_t *_dstU, uint8_t *_dstV, const uint8_t *unused,           \
+                                         const uint8_t *src1, const uint8_t *src2,                        \
+                                         int width, uint32_t *rgb2yuv, void *opq)                         \
+{                                                                                                         \
+    const float *src = (const float*)src1;                                                                \
+    uint16_t *dstU = (uint16_t*)_dstU;                                                                    \
+    uint16_t *dstV = (uint16_t*)_dstV;                                                                    \
+    av_assert1(src1==src2);                                                                               \
+    rgbaf32ToUV_endian(dstU, dstV, endian, src, width, rgb2yuv, 4);                                       \
+}                                                                                                         \
+static void rgbaf32##endian_name##ToY_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0,       \
+                                        const uint8_t *unused1, int width, uint32_t *rgb2yuv, void *opq)  \
+{                                                                                                         \
+    const float *src = (const float*)_src;                                                                \
+    uint16_t *dst = (uint16_t*)_dst;                                                                      \
+    rgbaf32ToY_endian(dst, src, endian, width, rgb2yuv, 4);                                               \
+}                                                                                                         \
+static void rgbaf32##endian_name##ToA_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused0,       \
+                                        const uint8_t *unused1, int width, uint32_t *unused2, void *opq)  \
+{                                                                                                         \
+    const float *src = (const float*)_src;                                                                \
+    uint16_t *dst = (uint16_t*)_dst;                                                                      \
+    rgbaf32ToA_endian(dst, src, endian, width, opq);                                                      \
+}
+
+rgbaf32_funcs_endian(le, 0)
+rgbaf32_funcs_endian(be, 1)
+
 av_cold void ff_sws_init_input_funcs(SwsContext *c)
 {
     enum AVPixelFormat srcFormat = c->srcFormat;
@@ -1663,6 +1755,18 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_RGBAF16LE:
             c->chrToYV12 = rgbaf16leToUV_c;
             break;
+        case AV_PIX_FMT_RGBF32BE:
+            c->chrToYV12 = rgbf32beToUV_c;
+            break;
+        case AV_PIX_FMT_RGBAF32BE:
+            c->chrToYV12 = rgbaf32beToUV_c;
+            break;
+        case AV_PIX_FMT_RGBF32LE:
+            c->chrToYV12 = rgbf32leToUV_c;
+            break;
+        case AV_PIX_FMT_RGBAF32LE:
+            c->chrToYV12 = rgbaf32leToUV_c;
+            break;
         }
     }
 
@@ -1973,6 +2077,18 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
     case AV_PIX_FMT_RGBAF16LE:
         c->lumToYV12 = rgbaf16leToY_c;
         break;
+    case AV_PIX_FMT_RGBF32BE:
+        c->lumToYV12 = rgbf32beToY_c;
+        break;
+    case AV_PIX_FMT_RGBAF32BE:
+        c->lumToYV12 = rgbaf32beToY_c;
+        break;
+    case AV_PIX_FMT_RGBF32LE:
+        c->lumToYV12 = rgbf32leToY_c;
+        break;
+    case AV_PIX_FMT_RGBAF32LE:
+        c->lumToYV12 = rgbaf32leToY_c;
+        break;
     }
     if (c->needAlpha) {
         if (is16BPS(srcFormat) || isNBPS(srcFormat)) {
@@ -1998,6 +2114,12 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c)
         case AV_PIX_FMT_RGBAF16LE:
             c->alpToYV12 = rgbaf16leToA_c;
             break;
+        case AV_PIX_FMT_RGBAF32BE:
+            c->alpToYV12 = rgbaf32beToA_c;
+            break;
+        case AV_PIX_FMT_RGBAF32LE:
+            c->alpToYV12 = rgbaf32leToA_c;
+            break;
         case AV_PIX_FMT_YA8:
             c->alpToYV12 = uyvyToY_c;
             break;
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 85640a143f..2c520f68d1 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -266,6 +266,10 @@ static const FormatEntry format_entries[] = {
     [AV_PIX_FMT_VUYX]        = { 1, 1 },
     [AV_PIX_FMT_RGBAF16BE]   = { 1, 0 },
     [AV_PIX_FMT_RGBAF16LE]   = { 1, 0 },
+    [AV_PIX_FMT_RGBF32BE]    = { 1, 0 },
+    [AV_PIX_FMT_RGBF32LE]    = { 1, 0 },
+    [AV_PIX_FMT_RGBAF32BE]   = { 1, 0 },
+    [AV_PIX_FMT_RGBAF32LE]   = { 1, 0 },
     [AV_PIX_FMT_XV30LE]      = { 1, 1 },
     [AV_PIX_FMT_XV36LE]      = { 1, 1 },
 };
@@ -1572,6 +1576,8 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
         srcFormat != AV_PIX_FMT_GBRAP16BE  && srcFormat != AV_PIX_FMT_GBRAP16LE &&
         srcFormat != AV_PIX_FMT_GBRPF32BE  && srcFormat != AV_PIX_FMT_GBRPF32LE &&
         srcFormat != AV_PIX_FMT_GBRAPF32BE && srcFormat != AV_PIX_FMT_GBRAPF32LE &&
+        srcFormat != AV_PIX_FMT_RGBF32BE   && srcFormat != AV_PIX_FMT_RGBF32LE  &&
+        srcFormat != AV_PIX_FMT_RGBAF32BE  && srcFormat != AV_PIX_FMT_RGBAF32LE  &&
         ((dstW >> c->chrDstHSubSample) <= (srcW >> 1) ||
          (flags & SWS_FAST_BILINEAR)))
         c->chrSrcHSubSample = 1;
-- 
2.31.1.windows.1


* [FFmpeg-devel] [PATCH v5 2/4] avfilter/vf_hflip: add support for packed rgb float formats
  2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 1/4] swscale/input: add rgbaf32 input support mindmark
@ 2022-11-23 19:35 ` mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 3/4] avfilter/vf_transpose: " mindmark
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: mindmark @ 2022-11-23 19:35 UTC
  To: ffmpeg-devel; +Cc: Mark Reid

From: Mark Reid <mindmark@gmail.com>

---
 libavfilter/vf_hflip_init.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/libavfilter/vf_hflip_init.h b/libavfilter/vf_hflip_init.h
index d0319f463d..31173f73fc 100644
--- a/libavfilter/vf_hflip_init.h
+++ b/libavfilter/vf_hflip_init.h
@@ -86,6 +86,29 @@ static void hflip_qword_c(const uint8_t *ssrc, uint8_t *ddst, int w)
         dst[j] = src[-j];
 }
 
+static void hflip_b96_c(const uint8_t *ssrc, uint8_t *ddst, int w)
+{
+    const uint32_t *in = (const uint32_t *)ssrc;
+    uint32_t *out = (uint32_t *)ddst;
+
+    for (int j = 0; j < w; j++, out += 3, in -= 3) {
+        out[0] = in[0];
+        out[1] = in[1];
+        out[2] = in[2];
+    }
+}
+
+static void hflip_b128_c(const uint8_t *ssrc, uint8_t *ddst, int w)
+{
+    const uint64_t *in = (const uint64_t *)ssrc;
+    uint64_t *out = (uint64_t *)ddst;
+
+    for (int j = 0; j < w; j++, out += 2, in -= 2) {
+        out[0] = in[0];
+        out[1] = in[1];
+    }
+}
+
 static av_unused int ff_hflip_init(FlipContext *s, int step[4], int nb_planes)
 {
     for (int i = 0; i < nb_planes; i++) {
@@ -97,6 +120,8 @@ static av_unused int ff_hflip_init(FlipContext *s, int step[4], int nb_planes)
         case 4: s->flip_line[i] = hflip_dword_c; break;
         case 6: s->flip_line[i] = hflip_b48_c;   break;
         case 8: s->flip_line[i] = hflip_qword_c; break;
+        case 12: s->flip_line[i] = hflip_b96_c; break;
+        case 16: s->flip_line[i] = hflip_b128_c; break;
         default:
             return AVERROR_BUG;
         }
-- 
2.31.1.windows.1


* [FFmpeg-devel] [PATCH v5 3/4] avfilter/vf_transpose: add support for packed rgb float formats
  2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 1/4] swscale/input: add rgbaf32 input support mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 2/4] avfilter/vf_hflip: add support for packed rgb float formats mindmark
@ 2022-11-23 19:35 ` mindmark
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support mindmark
  2022-12-04 21:48 ` [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support Mark Reid
  4 siblings, 0 replies; 8+ messages in thread
From: mindmark @ 2022-11-23 19:35 UTC
  To: ffmpeg-devel; +Cc: Mark Reid

From: Mark Reid <mindmark@gmail.com>

---
 libavfilter/vf_transpose.c | 44 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/libavfilter/vf_transpose.c b/libavfilter/vf_transpose.c
index 469e66729f..1023d6fe82 100644
--- a/libavfilter/vf_transpose.c
+++ b/libavfilter/vf_transpose.c
@@ -174,6 +174,46 @@ static void transpose_8x8_64_c(uint8_t *src, ptrdiff_t src_linesize,
     transpose_block_64_c(src, src_linesize, dst, dst_linesize, 8, 8);
 }
 
+static inline void transpose_block_96_c(uint8_t *src, ptrdiff_t src_linesize,
+                                        uint8_t *dst, ptrdiff_t dst_linesize,
+                                        int w, int h)
+{
+    int x, y;
+    for (y = 0; y < h; y++, dst += dst_linesize, src += 12) {
+        for (x = 0; x < w; x++) {
+            *((uint32_t *)(dst+0 + 12*x)) = *((uint32_t *)(src+0 + x*src_linesize));
+            *((uint32_t *)(dst+4 + 12*x)) = *((uint32_t *)(src+4 + x*src_linesize));
+            *((uint32_t *)(dst+8 + 12*x)) = *((uint32_t *)(src+8 + x*src_linesize));
+        }
+    }
+}
+
+static void transpose_8x8_96_c(uint8_t *src, ptrdiff_t src_linesize,
+                               uint8_t *dst, ptrdiff_t dst_linesize)
+{
+    transpose_block_96_c(src, src_linesize, dst, dst_linesize, 8, 8);
+}
+
+
+static inline void transpose_block_128_c(uint8_t *src, ptrdiff_t src_linesize,
+                                         uint8_t *dst, ptrdiff_t dst_linesize,
+                                         int w, int h)
+{
+    int x, y;
+    for (y = 0; y < h; y++, dst += dst_linesize, src += 16) {
+        for (x = 0; x < w; x++) {
+            *((uint64_t *)(dst+0 + 16*x)) = *((uint64_t *)(src+0 + x*src_linesize));
+            *((uint64_t *)(dst+8 + 16*x)) = *((uint64_t *)(src+8 + x*src_linesize));
+        }
+    }
+}
+
+static void transpose_8x8_128_c(uint8_t *src, ptrdiff_t src_linesize,
+                                uint8_t *dst, ptrdiff_t dst_linesize)
+{
+    transpose_block_128_c(src, src_linesize, dst, dst_linesize, 8, 8);
+}
+
 static int config_props_output(AVFilterLink *outlink)
 {
     AVFilterContext *ctx = outlink->src;
@@ -232,6 +272,10 @@ static int config_props_output(AVFilterLink *outlink)
                 v->transpose_8x8   = transpose_8x8_48_c; break;
         case 8: v->transpose_block = transpose_block_64_c;
                 v->transpose_8x8   = transpose_8x8_64_c; break;
+        case 12: v->transpose_block = transpose_block_96_c;
+                 v->transpose_8x8   = transpose_8x8_96_c; break;
+        case 16: v->transpose_block = transpose_block_128_c;
+                 v->transpose_8x8   = transpose_8x8_128_c; break;
         }
     }
 
-- 
2.31.1.windows.1


* [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support
  2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
                   ` (2 preceding siblings ...)
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 3/4] avfilter/vf_transpose: " mindmark
@ 2022-11-23 19:35 ` mindmark
  2022-12-05  0:05   ` Michael Niedermayer
  2022-12-04 21:48 ` [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support Mark Reid
  4 siblings, 1 reply; 8+ messages in thread
From: mindmark @ 2022-11-23 19:35 UTC
  To: ffmpeg-devel; +Cc: Mark Reid

From: Mark Reid <mindmark@gmail.com>

---
 libswscale/output.c                      | 92 ++++++++++++++++++++++++
 libswscale/swscale_unscaled.c            |  4 +-
 libswscale/tests/floatimg_cmp.c          |  4 +-
 libswscale/utils.c                       | 16 +++--
 libswscale/yuv2rgb.c                     |  2 +
 tests/ref/fate/filter-pixdesc-rgbaf32be  |  1 +
 tests/ref/fate/filter-pixdesc-rgbaf32le  |  1 +
 tests/ref/fate/filter-pixdesc-rgbf32be   |  1 +
 tests/ref/fate/filter-pixdesc-rgbf32le   |  1 +
 tests/ref/fate/filter-pixfmts-copy       |  4 ++
 tests/ref/fate/filter-pixfmts-crop       |  4 ++
 tests/ref/fate/filter-pixfmts-field      |  4 ++
 tests/ref/fate/filter-pixfmts-fieldorder |  4 ++
 tests/ref/fate/filter-pixfmts-hflip      |  4 ++
 tests/ref/fate/filter-pixfmts-il         |  4 ++
 tests/ref/fate/filter-pixfmts-null       |  4 ++
 tests/ref/fate/filter-pixfmts-scale      |  4 ++
 tests/ref/fate/filter-pixfmts-transpose  |  4 ++
 tests/ref/fate/filter-pixfmts-vflip      |  4 ++
 tests/ref/fate/sws-floatimg-cmp          | 16 +++++
 20 files changed, 170 insertions(+), 8 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32le
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32le

diff --git a/libswscale/output.c b/libswscale/output.c
index 5c85bff971..1d86a244f9 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -2471,6 +2471,92 @@ yuv2gbrpf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
     }
 }
 
+static void
+yuv2rgbaf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
+                    const int16_t **lumSrcx, int lumFilterSize,
+                    const int16_t *chrFilter, const int16_t **chrUSrcx,
+                    const int16_t **chrVSrcx, int chrFilterSize,
+                    const int16_t **alpSrcx, uint8_t *dest,
+                    int dstW, int y)
+{
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(c->dstFormat);
+    int i;
+    int alpha = desc->flags & AV_PIX_FMT_FLAG_ALPHA;
+    int hasAlpha = alpha && alpSrcx;
+    int pixelStep = alpha ? 4 : 3;
+    uint32_t *dest32 = (uint32_t*)dest;
+    const int32_t **lumSrc  = (const int32_t**)lumSrcx;
+    const int32_t **chrUSrc = (const int32_t**)chrUSrcx;
+    const int32_t **chrVSrc = (const int32_t**)chrVSrcx;
+    const int32_t **alpSrc  = (const int32_t**)alpSrcx;
+    static const float float_mult = 1.0f / 65535.0f;
+    uint32_t a = av_float2int(1.0f);
+
+    for (i = 0; i < dstW; i++) {
+        int j;
+        int Y = -0x40000000;
+        int U = -(128 << 23);
+        int V = -(128 << 23);
+        int R, G, B, A;
+
+        for (j = 0; j < lumFilterSize; j++)
+            Y += lumSrc[j][i] * (unsigned)lumFilter[j];
+
+        for (j = 0; j < chrFilterSize; j++) {
+            U += chrUSrc[j][i] * (unsigned)chrFilter[j];
+            V += chrVSrc[j][i] * (unsigned)chrFilter[j];
+        }
+
+        Y >>= 14;
+        Y += 0x10000;
+        U >>= 14;
+        V >>= 14;
+
+        if (hasAlpha) {
+            A = -0x40000000;
+
+            for (j = 0; j < lumFilterSize; j++)
+                A += alpSrc[j][i] * (unsigned)lumFilter[j];
+
+            A >>= 1;
+            A += 0x20002000;
+            a = av_float2int(float_mult * (float)(av_clip_uintp2(A, 30) >> 14));
+        }
+
+        Y -= c->yuv2rgb_y_offset;
+        Y *= c->yuv2rgb_y_coeff;
+        Y += (1 << 13) - (1 << 29);
+        R = V * c->yuv2rgb_v2r_coeff;
+        G = V * c->yuv2rgb_v2g_coeff + U * c->yuv2rgb_u2g_coeff;
+        B =                            U * c->yuv2rgb_u2b_coeff;
+
+        R = av_clip_uintp2(((Y + R) >> 14) + (1<<15), 16);
+        G = av_clip_uintp2(((Y + G) >> 14) + (1<<15), 16);
+        B = av_clip_uintp2(((Y + B) >> 14) + (1<<15), 16);
+
+        dest32[0] = av_float2int(float_mult * (float)R);
+        dest32[1] = av_float2int(float_mult * (float)G);
+        dest32[2] = av_float2int(float_mult * (float)B);
+        if (alpha)
+            dest32[3] = a;
+
+        dest32 += pixelStep;
+    }
+    if ((!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) {
+        dest32 = (uint32_t*)dest;
+        for (i = 0; i < dstW; i++) {
+            dest32[0] = av_bswap32(dest32[0]);
+            dest32[1] = av_bswap32(dest32[1]);
+            dest32[2] = av_bswap32(dest32[2]);
+            if (alpha)
+                dest32[3] = av_bswap32(dest32[3]);
+
+            dest32 += pixelStep;
+        }
+    }
+
+}
+
 static void
 yuv2ya8_1_c(SwsContext *c, const int16_t *buf0,
             const int16_t *ubuf[2], const int16_t *vbuf[2],
@@ -2983,6 +3069,12 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
             }
             break;
 
+        case AV_PIX_FMT_RGBF32LE:
+        case AV_PIX_FMT_RGBF32BE:
+        case AV_PIX_FMT_RGBAF32LE:
+        case AV_PIX_FMT_RGBAF32BE:
+            *yuv2packedX = yuv2rgbaf32_full_X_c;
+            break;
         case AV_PIX_FMT_RGB24:
             *yuv2packedX = yuv2rgb24_full_X_c;
             *yuv2packed2 = yuv2rgb24_full_2_c;
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 9af2e7ecc3..5a73cfa541 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -2160,7 +2160,9 @@ void ff_get_unscaled_swscale(SwsContext *c)
 
     /* bswap 32 bits per pixel/component formats */
     if (IS_DIFFERENT_ENDIANESS(srcFormat, dstFormat, AV_PIX_FMT_GBRPF32) ||
-        IS_DIFFERENT_ENDIANESS(srcFormat, dstFormat, AV_PIX_FMT_GBRAPF32))
+        IS_DIFFERENT_ENDIANESS(srcFormat, dstFormat, AV_PIX_FMT_GBRAPF32) ||
+        IS_DIFFERENT_ENDIANESS(srcFormat, dstFormat, AV_PIX_FMT_RGBF32)   ||
+        IS_DIFFERENT_ENDIANESS(srcFormat, dstFormat, AV_PIX_FMT_RGBAF32))
         c->convert_unscaled = bswap_32bpc;
 
     if (usePal(srcFormat) && isByteRGB(dstFormat))
diff --git a/libswscale/tests/floatimg_cmp.c b/libswscale/tests/floatimg_cmp.c
index 5c67594fb6..9559c93aac 100644
--- a/libswscale/tests/floatimg_cmp.c
+++ b/libswscale/tests/floatimg_cmp.c
@@ -54,7 +54,9 @@ static const enum AVPixelFormat pix_fmts[] = {
     AV_PIX_FMT_GBRP10LE, AV_PIX_FMT_GBRAP10LE,
     AV_PIX_FMT_GBRP12LE, AV_PIX_FMT_GBRAP12LE,
     AV_PIX_FMT_GBRP14LE,
-    AV_PIX_FMT_GBRP16LE,  AV_PIX_FMT_GBRAP16LE
+    AV_PIX_FMT_GBRP16LE, AV_PIX_FMT_GBRAP16LE,
+    AV_PIX_FMT_RGBF32LE, AV_PIX_FMT_RGBAF32LE,
+    AV_PIX_FMT_RGBF32BE, AV_PIX_FMT_RGBAF32BE
 };
 
 const char *usage =  "floatimg_cmp -pixel_format <pix_fmt> -size <image_size> -ref <testfile>\n";
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 2c520f68d1..642f765550 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -266,10 +266,10 @@ static const FormatEntry format_entries[] = {
     [AV_PIX_FMT_VUYX]        = { 1, 1 },
     [AV_PIX_FMT_RGBAF16BE]   = { 1, 0 },
     [AV_PIX_FMT_RGBAF16LE]   = { 1, 0 },
-    [AV_PIX_FMT_RGBF32BE]    = { 1, 0 },
-    [AV_PIX_FMT_RGBF32LE]    = { 1, 0 },
-    [AV_PIX_FMT_RGBAF32BE]   = { 1, 0 },
-    [AV_PIX_FMT_RGBAF32LE]   = { 1, 0 },
+    [AV_PIX_FMT_RGBF32BE]    = { 1, 1 },
+    [AV_PIX_FMT_RGBF32LE]    = { 1, 1 },
+    [AV_PIX_FMT_RGBAF32BE]   = { 1, 1 },
+    [AV_PIX_FMT_RGBAF32LE]   = { 1, 1 },
     [AV_PIX_FMT_XV30LE]      = { 1, 1 },
     [AV_PIX_FMT_XV36LE]      = { 1, 1 },
 };
@@ -1512,7 +1512,7 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
             }
         }
     }
-    if (isPlanarRGB(dstFormat)) {
+    if (isPlanarRGB(dstFormat) || isFloat(dstFormat)) {
         if (!(flags & SWS_FULL_CHR_H_INT)) {
             av_log(c, AV_LOG_DEBUG,
                    "%s output is not supported with half chroma resolution, switching to full\n",
@@ -1544,7 +1544,11 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
         dstFormat != AV_PIX_FMT_BGR4_BYTE &&
         dstFormat != AV_PIX_FMT_RGB4_BYTE &&
         dstFormat != AV_PIX_FMT_BGR8 &&
-        dstFormat != AV_PIX_FMT_RGB8
+        dstFormat != AV_PIX_FMT_RGB8 &&
+        dstFormat != AV_PIX_FMT_RGBF32LE &&
+        dstFormat != AV_PIX_FMT_RGBF32BE &&
+        dstFormat != AV_PIX_FMT_RGBAF32LE &&
+        dstFormat != AV_PIX_FMT_RGBAF32BE
     ) {
         av_log(c, AV_LOG_WARNING,
                "full chroma interpolation for destination format '%s' not yet implemented\n",
diff --git a/libswscale/yuv2rgb.c b/libswscale/yuv2rgb.c
index 9c3f5e23c6..7ad9d2c9dc 100644
--- a/libswscale/yuv2rgb.c
+++ b/libswscale/yuv2rgb.c
@@ -999,6 +999,8 @@ av_cold int ff_yuv2rgb_c_init_tables(SwsContext *c, const int inv_table[4],
         break;
     case 32:
     case 64:
+    case 96:
+    case 128:
         base      = (c->dstFormat == AV_PIX_FMT_RGB32_1 ||
                      c->dstFormat == AV_PIX_FMT_BGR32_1) ? 8 : 0;
         rbase     = base + (isRgb ? 16 : 0);
diff --git a/tests/ref/fate/filter-pixdesc-rgbaf32be b/tests/ref/fate/filter-pixdesc-rgbaf32be
new file mode 100644
index 0000000000..def2537716
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-rgbaf32be
@@ -0,0 +1 @@
+pixdesc-rgbaf32be   8c618ffd38084857013b0f50d2cee0dd
diff --git a/tests/ref/fate/filter-pixdesc-rgbaf32le b/tests/ref/fate/filter-pixdesc-rgbaf32le
new file mode 100644
index 0000000000..01913ecaa7
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-rgbaf32le
@@ -0,0 +1 @@
+pixdesc-rgbaf32le   bb5239a00a3ec2b1f1d569733a1e03a2
diff --git a/tests/ref/fate/filter-pixdesc-rgbf32be b/tests/ref/fate/filter-pixdesc-rgbf32be
new file mode 100644
index 0000000000..77e40f20b9
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-rgbf32be
@@ -0,0 +1 @@
+pixdesc-rgbf32be    ca6c4115dc368a192a67d06bd47f369f
diff --git a/tests/ref/fate/filter-pixdesc-rgbf32le b/tests/ref/fate/filter-pixdesc-rgbf32le
new file mode 100644
index 0000000000..c0cb4d60f0
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-rgbf32le
@@ -0,0 +1 @@
+pixdesc-rgbf32le    290153561ddc3266b253fa075e040578
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index b28a114c7b..9aa653d12d 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -90,6 +90,10 @@ rgb8                7ac6008c84d622c2fc50581706e17576
 rgba                b6e1b441c365e03b5ffdf9b7b68d9a0c
 rgba64be            ae2ae04b5efedca3505f47c4dd6ea6ea
 rgba64le            b91e1d77f799eb92241a2d2d28437b15
+rgbaf32be           abb244bb1af2247bbba7b4eac0357f6b
+rgbaf32le           99b2afe809649e58eea0a188fd6ab3f2
+rgbf32be            e4a0b47e8ecb2e7e461915227ebd4edd
+rgbf32le            2979d6bfe509a55486e55e4de63d67d1
 uyvy422             3bcf3c80047592f2211fae3260b1b65d
 vuya                3d5e934651cae1ce334001cb1829ad22
 vuyx                3f68ea6ec492b30d867cb5401562264e
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index bdb2536f7d..9bb54cf9f0 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -88,6 +88,10 @@ rgb8                9b364a8f112ad9459fec47a51cc03b30
 rgba                9488ac85abceaf99a9309eac5a87697e
 rgba64be            89910046972ab3c68e2a348302cc8ca9
 rgba64le            fea8ebfc869b52adf353778f29eac7a7
+rgbaf32be           48be3ace31f293fa2bf174c583c46b50
+rgbaf32le           a2a358ecc5fe53a684d3de994134c4ce
+rgbf32be            ad982fbaf08c908658e66cb4330257cd
+rgbf32le            14b944af62af458fa53da118721a258f
 vuya                76578a705ff3a37559653c1289bd03dd
 vuyx                5d2bae51a2f4892bd5f177f190cc323b
 x2bgr10le           84de725b85662c362862820dc4a309aa
diff --git a/tests/ref/fate/filter-pixfmts-field b/tests/ref/fate/filter-pixfmts-field
index 4e5a798471..5c8817a5c7 100644
--- a/tests/ref/fate/filter-pixfmts-field
+++ b/tests/ref/fate/filter-pixfmts-field
@@ -90,6 +90,10 @@ rgb8                62c3b9e2a171de3d894a8eeb271c85e8
 rgba                ee616262ca6d67b7ecfba4b36c602ce3
 rgba64be            23c8c0edaabe3eaec89ce69633fb0048
 rgba64le            dfdba4de4a7cac9abf08852666c341d3
+rgbaf32be           f081b6c6df6094d36563b78dccdbc97a
+rgbaf32le           e461aff4408751a03f8effd0e16be47c
+rgbf32be            d71855f12b960ab02b787c99bc0e4cf1
+rgbf32le            a4251c66110290418f15cb1e334a7f0d
 uyvy422             1c49e44ab3f060e85fc4a3a9464f045e
 vuya                f72bcf29d75cd143d0c565f7cc49119a
 vuyx                6257cd1ce11330660e9fa9c675acbdcc
diff --git a/tests/ref/fate/filter-pixfmts-fieldorder b/tests/ref/fate/filter-pixfmts-fieldorder
index bebaf07371..05b288dbe3 100644
--- a/tests/ref/fate/filter-pixfmts-fieldorder
+++ b/tests/ref/fate/filter-pixfmts-fieldorder
@@ -79,6 +79,10 @@ rgb8                6deae05ccac5c50bd0d9c9fe8e124557
 rgba                1fdf872a087a32cd35b80cc7be399578
 rgba64be            5598f44514d122b9a57c5c92c20bbc61
 rgba64le            b34e6e30621ae579519a2d91a96a0acf
+rgbaf32be           9a7c2182fd89218172f07e3b37f9728e
+rgbaf32le           940e87c0ddb7a86f9df438591b4870e6
+rgbf32be            f23c5dd85e312471e3def0204051d264
+rgbf32le            fec06f561663e7d01dec08c1c337f068
 uyvy422             75de70e31c435dde878002d3f22b238a
 vuya                a3891d4168ff208948fd0b3ba0910495
 vuyx                d7a900e970c9a69ed41f8b220114b9fa
diff --git a/tests/ref/fate/filter-pixfmts-hflip b/tests/ref/fate/filter-pixfmts-hflip
index fd5e9723fd..02a734d8af 100644
--- a/tests/ref/fate/filter-pixfmts-hflip
+++ b/tests/ref/fate/filter-pixfmts-hflip
@@ -88,6 +88,10 @@ rgb8                68a3a575badadd9e4f90226209f11699
 rgba                51961c723ea6707e0a410cd3f21f15d3
 rgba64be            c910444019f4cfbf4d995227af55da8d
 rgba64le            0c810d8b3a6bca10321788e1cb145340
+rgbaf32be           6077603ed554db4149f82d891e1d1120
+rgbaf32le           3081376a3a9e90fd2849751dfd69094b
+rgbf32be            0e5ffacba8dc768acfa50fc590dddac6
+rgbf32le            542380fd2490e3227b70373eace3bade
 vuya                7e530261e7ac4eae4fd616fd7572d0b8
 vuyx                3ce9890363cad3984521293be1eb679c
 x2bgr10le           827cc659f29378e00c5a7d2c0ada8f9a
diff --git a/tests/ref/fate/filter-pixfmts-il b/tests/ref/fate/filter-pixfmts-il
index ec9d809721..48701fe9a3 100644
--- a/tests/ref/fate/filter-pixfmts-il
+++ b/tests/ref/fate/filter-pixfmts-il
@@ -89,6 +89,10 @@ rgb8                93f9fa5ecf522abe13ed34f21831fdfe
 rgba                625d8f4bd39c4bdbf61eb5e4713aecc9
 rgba64be            db70d33aa6c06f3e0a1c77bd11284261
 rgba64le            a8a2daae04374a27219bc1c890204007
+rgbaf32be           e803073624dee4ce4c10d5d22f95b788
+rgbaf32le           f87c6898ef385fb7239b16677a450326
+rgbf32be            3696160d95d41564a1a375ddeab82d70
+rgbf32le            71d9d0c83ded4c0866ea0673c1eed8d0
 uyvy422             d6ee3ca43356d08c392382b24b22cda5
 vuya                b9deab5ba249dd608b709c09255a4932
 vuyx                49cc92fcc002ec0f312017014dd68c0c
diff --git a/tests/ref/fate/filter-pixfmts-null b/tests/ref/fate/filter-pixfmts-null
index b28a114c7b..9aa653d12d 100644
--- a/tests/ref/fate/filter-pixfmts-null
+++ b/tests/ref/fate/filter-pixfmts-null
@@ -90,6 +90,10 @@ rgb8                7ac6008c84d622c2fc50581706e17576
 rgba                b6e1b441c365e03b5ffdf9b7b68d9a0c
 rgba64be            ae2ae04b5efedca3505f47c4dd6ea6ea
 rgba64le            b91e1d77f799eb92241a2d2d28437b15
+rgbaf32be           abb244bb1af2247bbba7b4eac0357f6b
+rgbaf32le           99b2afe809649e58eea0a188fd6ab3f2
+rgbf32be            e4a0b47e8ecb2e7e461915227ebd4edd
+rgbf32le            2979d6bfe509a55486e55e4de63d67d1
 uyvy422             3bcf3c80047592f2211fae3260b1b65d
 vuya                3d5e934651cae1ce334001cb1829ad22
 vuyx                3f68ea6ec492b30d867cb5401562264e
diff --git a/tests/ref/fate/filter-pixfmts-scale b/tests/ref/fate/filter-pixfmts-scale
index 525306ec12..b22a6d639d 100644
--- a/tests/ref/fate/filter-pixfmts-scale
+++ b/tests/ref/fate/filter-pixfmts-scale
@@ -90,6 +90,10 @@ rgb8                bcdc033b4ef0979d060dbc8893d4db58
 rgba                85bb5d03cea1c6e8002ced3373904336
 rgba64be            ee73e57923af984b31cc7795d13929da
 rgba64le            783d2779adfafe3548bdb671ec0de69e
+rgbaf32be           0ccc31b373d6465ab72247ae42f395ce
+rgbaf32le           3eb06a5ba1c2341f12a30ff2539d5de0
+rgbf32be            11d0d9722abf9ea4fdce370920ff5820
+rgbf32le            a9c247ff3d50e07bd7d0f777b9f82b32
 uyvy422             aeb4ba4f9f003ae21f6d18089198244f
 vuya                ffa817e283bf6a0b6fba21b07523ccaa
 vuyx                ba182200e20e0c82765eba15217848d3
diff --git a/tests/ref/fate/filter-pixfmts-transpose b/tests/ref/fate/filter-pixfmts-transpose
index 24f4249639..868a784731 100644
--- a/tests/ref/fate/filter-pixfmts-transpose
+++ b/tests/ref/fate/filter-pixfmts-transpose
@@ -82,6 +82,10 @@ rgb8                c90feb30c3c9391ef5f470209d7b7a15
 rgba                4d76a9542143752a4ac30f82f88f68f1
 rgba64be            a60041217f4c0cd796d19d3940a12a41
 rgba64le            ad47197774858858ae7b0c177dffa459
+rgbaf32be           83d7d66f59a8b62c40aa59c81188584d
+rgbaf32le           d9cb46bfebebadaf838bb13d243feabe
+rgbf32be            b090912ac60da12be5f84672d223300f
+rgbf32le            4b4e06621fb626952fc632f5049453ba
 vuya                9ece18a345beb17cd19e09e443eca4bf
 vuyx                4c2929cd1c6e5512f62e802f482f0ef2
 x2bgr10le           4aa774b6d8f6d446a64f1f288e5c97eb
diff --git a/tests/ref/fate/filter-pixfmts-vflip b/tests/ref/fate/filter-pixfmts-vflip
index b7b0526588..b999109ff8 100644
--- a/tests/ref/fate/filter-pixfmts-vflip
+++ b/tests/ref/fate/filter-pixfmts-vflip
@@ -90,6 +90,10 @@ rgb8                7df049b6094f8a5e084d74462f6d6cde
 rgba                c1a5908572737f2ae1e5d8218af65f4b
 rgba64be            17e6273323b5779b5f3f775f150c1011
 rgba64le            48f45b10503b7dd140329c3dd0d54c98
+rgbaf32be           cd7a7ce6f70ef8ca4a05d69458c6c40d
+rgbaf32le           4516303140dd7ff5bc02e4ae0c94b46d
+rgbf32be            f416f773dd17dd15ca0845b47bf5ff44
+rgbf32le            16c50e897557972203c7e637080dc885
 uyvy422             3a237e8376264e0cfa78f8a3fdadec8a
 vuya                fb849f76e56181e005c31fce75d7038c
 vuyx                7a8079a97610e2c1c97aa8832b58a102
diff --git a/tests/ref/fate/sws-floatimg-cmp b/tests/ref/fate/sws-floatimg-cmp
index 251042f1c3..d524dd612f 100644
--- a/tests/ref/fate/sws-floatimg-cmp
+++ b/tests/ref/fate/sws-floatimg-cmp
@@ -118,3 +118,19 @@ gbrpf32le -> gbrap16le -> gbrpf32le
 avg diff: 0.000249
 min diff: 0.000000
 max diff: 0.000990
+gbrpf32le -> rgbf32le -> gbrpf32le
+avg diff: 0.000249
+min diff: 0.000000
+max diff: 0.000990
+gbrpf32le -> rgbaf32le -> gbrpf32le
+avg diff: 0.000249
+min diff: 0.000000
+max diff: 0.000990
+gbrpf32le -> rgbf32be -> gbrpf32le
+avg diff: 0.000249
+min diff: 0.000000
+max diff: 0.000990
+gbrpf32le -> rgbaf32be -> gbrpf32le
+avg diff: 0.000249
+min diff: 0.000000
+max diff: 0.000990
-- 
2.31.1.windows.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* Re: [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support
  2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
                   ` (3 preceding siblings ...)
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support mindmark
@ 2022-12-04 21:48 ` Mark Reid
  4 siblings, 0 replies; 8+ messages in thread
From: Mark Reid @ 2022-12-04 21:48 UTC (permalink / raw)
  To: ffmpeg-devel

On Wed, Nov 23, 2022 at 11:35 AM <mindmark@gmail.com> wrote:

> From: Mark Reid <mindmark@gmail.com>
>
> This patch series adds swscale input/output support for the packed rgb
> float formats.
> A few of the filters also needed to support the larger 96/128-bit packed
> pixel sizes.
>
> I also plan to eventually add lossless unscaled conversions between the
> planar and packed formats.
>
> changes since v4
> * added comment about refactoring input functions
> changes since v3
> * removed half uv path implementation
> changes since v2
> * add bias to rgbaf32 output to improve non-overflowing range
> changes since v1
> * output correct alpha if src doesn't have alpha
>
>
> Mark Reid (4):
>   swscale/input: add rgbaf32 input support
>   avfilter/vf_hflip: add support for packed rgb float formats
>   avfilter/vf_transpose: add support for packed rgb float formats
>   swscale/output: add rgbaf32 output support
>
>  libavfilter/vf_hflip_init.h              |  25 +++++
>  libavfilter/vf_transpose.c               |  44 ++++++++
>  libswscale/input.c                       | 122 +++++++++++++++++++++++
>  libswscale/output.c                      |  92 +++++++++++++++++
>  libswscale/swscale_unscaled.c            |   4 +-
>  libswscale/tests/floatimg_cmp.c          |   4 +-
>  libswscale/utils.c                       |  14 ++-
>  libswscale/yuv2rgb.c                     |   2 +
>  tests/ref/fate/filter-pixdesc-rgbaf32be  |   1 +
>  tests/ref/fate/filter-pixdesc-rgbaf32le  |   1 +
>  tests/ref/fate/filter-pixdesc-rgbf32be   |   1 +
>  tests/ref/fate/filter-pixdesc-rgbf32le   |   1 +
>  tests/ref/fate/filter-pixfmts-copy       |   4 +
>  tests/ref/fate/filter-pixfmts-crop       |   4 +
>  tests/ref/fate/filter-pixfmts-field      |   4 +
>  tests/ref/fate/filter-pixfmts-fieldorder |   4 +
>  tests/ref/fate/filter-pixfmts-hflip      |   4 +
>  tests/ref/fate/filter-pixfmts-il         |   4 +
>  tests/ref/fate/filter-pixfmts-null       |   4 +
>  tests/ref/fate/filter-pixfmts-scale      |   4 +
>  tests/ref/fate/filter-pixfmts-transpose  |   4 +
>  tests/ref/fate/filter-pixfmts-vflip      |   4 +
>  tests/ref/fate/sws-floatimg-cmp          |  16 +++
>  23 files changed, 363 insertions(+), 4 deletions(-)
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32be
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32le
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32be
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32le
>
> --
> 2.31.1.windows.1
>
>
ping

* Re: [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support
  2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support mindmark
@ 2022-12-05  0:05   ` Michael Niedermayer
  2022-12-05  5:39     ` Mark Reid
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Niedermayer @ 2022-12-05  0:05 UTC (permalink / raw)
  To: FFmpeg development discussions and patches


On Wed, Nov 23, 2022 at 11:35:40AM -0800, mindmark@gmail.com wrote:
> From: Mark Reid <mindmark@gmail.com>
> 
> ---
>  libswscale/output.c                      | 92 ++++++++++++++++++++++++
>  libswscale/swscale_unscaled.c            |  4 +-
>  libswscale/tests/floatimg_cmp.c          |  4 +-
>  libswscale/utils.c                       | 16 +++--
>  libswscale/yuv2rgb.c                     |  2 +
>  tests/ref/fate/filter-pixdesc-rgbaf32be  |  1 +
>  tests/ref/fate/filter-pixdesc-rgbaf32le  |  1 +
>  tests/ref/fate/filter-pixdesc-rgbf32be   |  1 +
>  tests/ref/fate/filter-pixdesc-rgbf32le   |  1 +
>  tests/ref/fate/filter-pixfmts-copy       |  4 ++
>  tests/ref/fate/filter-pixfmts-crop       |  4 ++
>  tests/ref/fate/filter-pixfmts-field      |  4 ++
>  tests/ref/fate/filter-pixfmts-fieldorder |  4 ++
>  tests/ref/fate/filter-pixfmts-hflip      |  4 ++
>  tests/ref/fate/filter-pixfmts-il         |  4 ++
>  tests/ref/fate/filter-pixfmts-null       |  4 ++
>  tests/ref/fate/filter-pixfmts-scale      |  4 ++
>  tests/ref/fate/filter-pixfmts-transpose  |  4 ++
>  tests/ref/fate/filter-pixfmts-vflip      |  4 ++
>  tests/ref/fate/sws-floatimg-cmp          | 16 +++++
>  20 files changed, 170 insertions(+), 8 deletions(-)
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32be
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32le
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32be
>  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32le
> 
> diff --git a/libswscale/output.c b/libswscale/output.c
> index 5c85bff971..1d86a244f9 100644
> --- a/libswscale/output.c
> +++ b/libswscale/output.c
> @@ -2471,6 +2471,92 @@ yuv2gbrpf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
>      }
>  }
>  
> +static void
> +yuv2rgbaf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
> +                    const int16_t **lumSrcx, int lumFilterSize,
> +                    const int16_t *chrFilter, const int16_t **chrUSrcx,
> +                    const int16_t **chrVSrcx, int chrFilterSize,
> +                    const int16_t **alpSrcx, uint8_t *dest,
> +                    int dstW, int y)
> +{
> +    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(c->dstFormat);
> +    int i;
> +    int alpha = desc->flags & AV_PIX_FMT_FLAG_ALPHA;
> +    int hasAlpha = alpha && alpSrcx;
> +    int pixelStep = alpha ? 4 : 3;
> +    uint32_t *dest32 = (uint32_t*)dest;
> +    const int32_t **lumSrc  = (const int32_t**)lumSrcx;
> +    const int32_t **chrUSrc = (const int32_t**)chrUSrcx;
> +    const int32_t **chrVSrc = (const int32_t**)chrVSrcx;
> +    const int32_t **alpSrc  = (const int32_t**)alpSrcx;
> +    static const float float_mult = 1.0f / 65535.0f;
> +    uint32_t a = av_float2int(1.0f);
> +
> +    for (i = 0; i < dstW; i++) {
> +        int j;
> +        int Y = -0x40000000;
> +        int U = -(128 << 23);
> +        int V = -(128 << 23);
> +        int R, G, B, A;
> +
> +        for (j = 0; j < lumFilterSize; j++)
> +            Y += lumSrc[j][i] * (unsigned)lumFilter[j];
> +
> +        for (j = 0; j < chrFilterSize; j++) {
> +            U += chrUSrc[j][i] * (unsigned)chrFilter[j];
> +            V += chrVSrc[j][i] * (unsigned)chrFilter[j];
> +        }
> +
> +        Y >>= 14;
> +        Y += 0x10000;
> +        U >>= 14;
> +        V >>= 14;
> +
> +        if (hasAlpha) {
> +            A = -0x40000000;
> +
> +            for (j = 0; j < lumFilterSize; j++)
> +                A += alpSrc[j][i] * (unsigned)lumFilter[j];
> +
> +            A >>= 1;
> +            A += 0x20002000;
> +            a = av_float2int(float_mult * (float)(av_clip_uintp2(A, 30) >> 14));
> +        }
> +
> +        Y -= c->yuv2rgb_y_offset;
> +        Y *= c->yuv2rgb_y_coeff;
> +        Y += (1 << 13) - (1 << 29);
> +        R = V * c->yuv2rgb_v2r_coeff;
> +        G = V * c->yuv2rgb_v2g_coeff + U * c->yuv2rgb_u2g_coeff;
> +        B =                            U * c->yuv2rgb_u2b_coeff;
> +
> +        R = av_clip_uintp2(((Y + R) >> 14) + (1<<15), 16);
> +        G = av_clip_uintp2(((Y + G) >> 14) + (1<<15), 16);
> +        B = av_clip_uintp2(((Y + B) >> 14) + (1<<15), 16);
> +
> +        dest32[0] = av_float2int(float_mult * (float)R);
> +        dest32[1] = av_float2int(float_mult * (float)G);
> +        dest32[2] = av_float2int(float_mult * (float)B);
> +        if (alpha)
> +            dest32[3] = a;

Why is this using uint32_t with av_float2int() rather than writing floats directly?



> +
> +        dest32 += pixelStep;
> +    }
> +    if ((!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) {
> +        dest32 = (uint32_t*)dest;
> +        for (i = 0; i < dstW; i++) {
> +            dest32[0] = av_bswap32(dest32[0]);
> +            dest32[1] = av_bswap32(dest32[1]);
> +            dest32[2] = av_bswap32(dest32[2]);
> +            if (alpha)
> +                dest32[3] = av_bswap32(dest32[3]);
> +
> +            dest32 += pixelStep;
> +        }
> +    }

The code in bswapdsp seems more efficient; ideally it should be shared and
used here.

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.


* Re: [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support
  2022-12-05  0:05   ` Michael Niedermayer
@ 2022-12-05  5:39     ` Mark Reid
  0 siblings, 0 replies; 8+ messages in thread
From: Mark Reid @ 2022-12-05  5:39 UTC (permalink / raw)
  To: FFmpeg development discussions and patches

On Sun, Dec 4, 2022 at 4:05 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Wed, Nov 23, 2022 at 11:35:40AM -0800, mindmark@gmail.com wrote:
> > From: Mark Reid <mindmark@gmail.com>
> >
> > ---
> >  libswscale/output.c                      | 92 ++++++++++++++++++++++++
> >  libswscale/swscale_unscaled.c            |  4 +-
> >  libswscale/tests/floatimg_cmp.c          |  4 +-
> >  libswscale/utils.c                       | 16 +++--
> >  libswscale/yuv2rgb.c                     |  2 +
> >  tests/ref/fate/filter-pixdesc-rgbaf32be  |  1 +
> >  tests/ref/fate/filter-pixdesc-rgbaf32le  |  1 +
> >  tests/ref/fate/filter-pixdesc-rgbf32be   |  1 +
> >  tests/ref/fate/filter-pixdesc-rgbf32le   |  1 +
> >  tests/ref/fate/filter-pixfmts-copy       |  4 ++
> >  tests/ref/fate/filter-pixfmts-crop       |  4 ++
> >  tests/ref/fate/filter-pixfmts-field      |  4 ++
> >  tests/ref/fate/filter-pixfmts-fieldorder |  4 ++
> >  tests/ref/fate/filter-pixfmts-hflip      |  4 ++
> >  tests/ref/fate/filter-pixfmts-il         |  4 ++
> >  tests/ref/fate/filter-pixfmts-null       |  4 ++
> >  tests/ref/fate/filter-pixfmts-scale      |  4 ++
> >  tests/ref/fate/filter-pixfmts-transpose  |  4 ++
> >  tests/ref/fate/filter-pixfmts-vflip      |  4 ++
> >  tests/ref/fate/sws-floatimg-cmp          | 16 +++++
> >  20 files changed, 170 insertions(+), 8 deletions(-)
> >  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32be
> >  create mode 100644 tests/ref/fate/filter-pixdesc-rgbaf32le
> >  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32be
> >  create mode 100644 tests/ref/fate/filter-pixdesc-rgbf32le
> >
> > diff --git a/libswscale/output.c b/libswscale/output.c
> > index 5c85bff971..1d86a244f9 100644
> > --- a/libswscale/output.c
> > +++ b/libswscale/output.c
> > @@ -2471,6 +2471,92 @@ yuv2gbrpf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
> >      }
> >  }
> >
> > +static void
> > +yuv2rgbaf32_full_X_c(SwsContext *c, const int16_t *lumFilter,
> > +                    const int16_t **lumSrcx, int lumFilterSize,
> > +                    const int16_t *chrFilter, const int16_t **chrUSrcx,
> > +                    const int16_t **chrVSrcx, int chrFilterSize,
> > +                    const int16_t **alpSrcx, uint8_t *dest,
> > +                    int dstW, int y)
> > +{
> > +    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(c->dstFormat);
> > +    int i;
> > +    int alpha = desc->flags & AV_PIX_FMT_FLAG_ALPHA;
> > +    int hasAlpha = alpha && alpSrcx;
> > +    int pixelStep = alpha ? 4 : 3;
> > +    uint32_t *dest32 = (uint32_t*)dest;
> > +    const int32_t **lumSrc  = (const int32_t**)lumSrcx;
> > +    const int32_t **chrUSrc = (const int32_t**)chrUSrcx;
> > +    const int32_t **chrVSrc = (const int32_t**)chrVSrcx;
> > +    const int32_t **alpSrc  = (const int32_t**)alpSrcx;
> > +    static const float float_mult = 1.0f / 65535.0f;
> > +    uint32_t a = av_float2int(1.0f);
> > +
> > +    for (i = 0; i < dstW; i++) {
> > +        int j;
> > +        int Y = -0x40000000;
> > +        int U = -(128 << 23);
> > +        int V = -(128 << 23);
> > +        int R, G, B, A;
> > +
> > +        for (j = 0; j < lumFilterSize; j++)
> > +            Y += lumSrc[j][i] * (unsigned)lumFilter[j];
> > +
> > +        for (j = 0; j < chrFilterSize; j++) {
> > +            U += chrUSrc[j][i] * (unsigned)chrFilter[j];
> > +            V += chrVSrc[j][i] * (unsigned)chrFilter[j];
> > +        }
> > +
> > +        Y >>= 14;
> > +        Y += 0x10000;
> > +        U >>= 14;
> > +        V >>= 14;
> > +
> > +        if (hasAlpha) {
> > +            A = -0x40000000;
> > +
> > +            for (j = 0; j < lumFilterSize; j++)
> > +                A += alpSrc[j][i] * (unsigned)lumFilter[j];
> > +
> > +            A >>= 1;
> > +            A += 0x20002000;
> > +            a = av_float2int(float_mult * (float)(av_clip_uintp2(A, 30) >> 14));
> > +        }
> > +
> > +        Y -= c->yuv2rgb_y_offset;
> > +        Y *= c->yuv2rgb_y_coeff;
> > +        Y += (1 << 13) - (1 << 29);
> > +        R = V * c->yuv2rgb_v2r_coeff;
> > +        G = V * c->yuv2rgb_v2g_coeff + U * c->yuv2rgb_u2g_coeff;
> > +        B =                            U * c->yuv2rgb_u2b_coeff;
> > +
> > +        R = av_clip_uintp2(((Y + R) >> 14) + (1<<15), 16);
> > +        G = av_clip_uintp2(((Y + G) >> 14) + (1<<15), 16);
> > +        B = av_clip_uintp2(((Y + B) >> 14) + (1<<15), 16);
> > +
> > +        dest32[0] = av_float2int(float_mult * (float)R);
> > +        dest32[1] = av_float2int(float_mult * (float)G);
> > +        dest32[2] = av_float2int(float_mult * (float)B);
> > +        if (alpha)
> > +            dest32[3] = a;
>
> Why is this using uint32_t with av_float2int() rather than writing floats directly?
>
>
It's this way because it matches the planar f32 version; I will change
both.
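
For reference, a minimal sketch of the two store styles under discussion.
Everything here is illustrative: `float_bits` is a hypothetical stand-in for
av_float2int() (it reinterprets the float's bits via memcpy), and
`store_rgb_bits` / `store_rgb_float` are made-up names, not functions from
the patch. For native-endian output both styles should leave identical bytes
in the destination, which is why storing floats directly is a safe
simplification there.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for av_float2int(): reinterpret float bits as uint32_t. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Same scale factor as in the patch: maps 0..65535 to 0.0..1.0. */
static const float float_mult = 1.0f / 65535.0f;

/* Style used in the patch: compute the float, then store its raw bits. */
static void store_rgb_bits(uint32_t *dst, int R, int G, int B)
{
    dst[0] = float_bits(float_mult * (float)R);
    dst[1] = float_bits(float_mult * (float)G);
    dst[2] = float_bits(float_mult * (float)B);
}

/* Style suggested in review: store the floats directly. */
static void store_rgb_float(float *dst, int R, int G, int B)
{
    dst[0] = float_mult * (float)R;
    dst[1] = float_mult * (float)G;
    dst[2] = float_mult * (float)B;
}
```

The uint32_t form only pays off when the value has to be byte-swapped
afterwards, since the swap operates on integer words.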


>
>
> > +
> > +        dest32 += pixelStep;
> > +    }
> > +    if ((!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) {
> > +        dest32 = (uint32_t*)dest;
> > +        for (i = 0; i < dstW; i++) {
> > +            dest32[0] = av_bswap32(dest32[0]);
> > +            dest32[1] = av_bswap32(dest32[1]);
> > +            dest32[2] = av_bswap32(dest32[2]);
> > +            if (alpha)
> > +                dest32[3] = av_bswap32(dest32[3]);
> > +
> > +            dest32 += pixelStep;
> > +        }
> > +    }
>
> The code in bswapdsp seems more efficient; ideally it should be shared and
> used here.
>

I just sent a patch moving the bswapdsp code from avcodec to avutil.
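
As a rough illustration of why a shared routine fits here: the per-pixel loop
in the patch swaps every 32-bit word of the row, so it is equivalent to one
flat pass over dstW * pixelStep words, which is the shape of a
bswap_buf(dst, src, w) style helper. The sketch below uses hypothetical names
(`swap32`, `bswap32_row`, `needs_swap`) and plain C instead of the real FFmpeg
helpers, and shows only a scalar fallback; any speedup would come from the
SIMD versions behind the shared code, which are not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Swap n 32-bit words from src into dst; dst may equal src (in-place). */
static void bswap32_row(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = swap32(src[i]);
}

/* Same test as in the patch: swap only when the destination byte order
 * differs from the host byte order. */
static int needs_swap(int dst_is_be, int host_is_be)
{
    return (!dst_is_be) != (!host_is_be);
}
```

With this shape, a row of an RGBA float format would be handled by a single
call such as bswap32_row(row, row, (size_t)dstW * 4), with no per-pixel
alpha branch.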


>
> thx
>
> [...]

end of thread, other threads:[~2022-12-05  5:39 UTC | newest]

Thread overview: 8+ messages
2022-11-23 19:35 [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support mindmark
2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 1/4] swscale/input: add rgbaf32 input support mindmark
2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 2/4] avfilter/vf_hflip: add support for packed rgb float formats mindmark
2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 3/4] avfilter/vf_transpose: " mindmark
2022-11-23 19:35 ` [FFmpeg-devel] [PATCH v5 4/4] swscale/output: add rgbaf32 output support mindmark
2022-12-05  0:05   ` Michael Niedermayer
2022-12-05  5:39     ` Mark Reid
2022-12-04 21:48 ` [FFmpeg-devel] [PATCH v5 0/4] swscale: rgbaf32 input/output support Mark Reid
