Git Inbox Mirror of the ffmpeg-devel mailing list - see https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
* [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips
@ 2024-07-25 20:25 Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

Unlike on x86, fmin/fmax are single instructions on RISC-V, not function
calls. They are much faster than doing a comparison and then branching on
its result. With this, audiodsp.vector_clipf gets almost twice as fast, and
a properly unrolled version of it gets 4-5x faster, on SiFive-U74.
This is only the low-hanging fruit: FFMIN and FFMAX are presumably
affected as well.

This likely applies to other instruction sets with native IEEE floats,
especially those lacking a conditional select instruction.
---
 libavutil/riscv/intmath.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
index 3e7ab864c5..24f165eef1 100644
--- a/libavutil/riscv/intmath.h
+++ b/libavutil/riscv/intmath.h
@@ -22,6 +22,7 @@
 #define AVUTIL_RISCV_INTMATH_H
 
 #include <stdint.h>
+#include <math.h>
 
 #include "config.h"
 #include "libavutil/attributes.h"
@@ -72,6 +73,24 @@ static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
     return b;
 }
 
+#if defined (__riscv_f) || defined (__riscv_zfinx)
+#define av_clipf av_clipf_rvf
+static av_always_inline av_const float av_clipf_rvf(float a, float min,
+                                                    float max)
+{
+    return fminf(fmaxf(a, min), max);
+}
+#endif
+
+#if defined (__riscv_d) || defined (__riscv_zdinx)
+#define av_clipd av_clipd_rvd
+static av_always_inline av_const double av_clipd_rvd(double a, double min,
+                                                     double max)
+{
+    return fmin(fmax(a, min), max);
+}
+#endif
+
 #if defined (__GNUC__) || defined (__clang__)
 static inline av_const int ff_ctz_rv(int x)
 {
-- 
2.45.2

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

* [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation Rémi Denis-Courmont
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

Given that source and destination can alias, the compiler was forced to
perform each read-modify-write sequentially. We cannot use the `restrict`
qualifier to avoid this here because the AC-3 encoder uses the function
in-place. Instead this commit provides an explicit guarantee to the
compiler that batches of 8 elements will not overlap, so that it can
interleave calculations.

In practice contemporary optimising compilers are able to unroll and keep
the temporary array in FPU registers (without spilling).

On SiFive-U74, this speeds up the same-sign branch by 4x and the
opposite-sign branch by 1.5x.
---
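A standalone sketch of the pattern used by this patch, assuming len is a
multiple of 8 as the existing loops already do. clip_one is a hypothetical
stand-in for av_clipf. Within a batch, all eight loads happen before any
store, so the compiler may interleave them even when dst and src are the
same buffer.

```c
/* Hypothetical scalar clip standing in for av_clipf. */
static inline float clip_one(float a, float lo, float hi)
{
    return a < lo ? lo : (a > hi ? hi : a);
}

/* dst and src may be the same buffer; len is assumed to be a multiple of 8. */
static void clip_batched(float *dst, const float *src, int len,
                         float lo, float hi)
{
    for (int i = 0; i < len; i += 8) {
        float tmp[8]; /* local array: cannot alias dst or src */

        for (int j = 0; j < 8; j++)
            tmp[j] = clip_one(src[i + j], lo, hi);
        for (int j = 0; j < 8; j++)
            dst[i + j] = tmp[j];
    }
}
```

Calling clip_batched(buf, buf, 8, -1.0f, 1.0f) clips buf in place, which is
the usage pattern that rules out `restrict` here.
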
 libavcodec/audiodsp.c | 40 +++++++++++++++++-----------------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index c5427d3535..9e83f06aaa 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -38,41 +38,35 @@ static inline float clipf_c_one(float a, uint32_t mini,
 static void vector_clipf_c_opposite_sign(float *dst, const float *src,
                                          float min, float max, int len)
 {
-    int i;
     uint32_t mini        = av_float2int(min);
     uint32_t maxi        = av_float2int(max);
     uint32_t maxisign    = maxi ^ (1U << 31);
 
-    for (i = 0; i < len; i += 8) {
-        dst[i + 0] = clipf_c_one(src[i + 0], mini, maxi, maxisign);
-        dst[i + 1] = clipf_c_one(src[i + 1], mini, maxi, maxisign);
-        dst[i + 2] = clipf_c_one(src[i + 2], mini, maxi, maxisign);
-        dst[i + 3] = clipf_c_one(src[i + 3], mini, maxi, maxisign);
-        dst[i + 4] = clipf_c_one(src[i + 4], mini, maxi, maxisign);
-        dst[i + 5] = clipf_c_one(src[i + 5], mini, maxi, maxisign);
-        dst[i + 6] = clipf_c_one(src[i + 6], mini, maxi, maxisign);
-        dst[i + 7] = clipf_c_one(src[i + 7], mini, maxi, maxisign);
+    for (int i = 0; i < len; i += 8) {
+        float tmp[8];
+
+        for (int j = 0; j < 8; j++)
+            tmp[j]= clipf_c_one(src[i + j], mini, maxi, maxisign);
+        for (int j = 0; j < 8; j++)
+            dst[i + j] = tmp[j];
     }
 }
 
 static void vector_clipf_c(float *dst, const float *src, int len,
                            float min, float max)
 {
-    int i;
-
     if (min < 0 && max > 0) {
         vector_clipf_c_opposite_sign(dst, src, min, max, len);
-    } else {
-        for (i = 0; i < len; i += 8) {
-            dst[i]     = av_clipf(src[i], min, max);
-            dst[i + 1] = av_clipf(src[i + 1], min, max);
-            dst[i + 2] = av_clipf(src[i + 2], min, max);
-            dst[i + 3] = av_clipf(src[i + 3], min, max);
-            dst[i + 4] = av_clipf(src[i + 4], min, max);
-            dst[i + 5] = av_clipf(src[i + 5], min, max);
-            dst[i + 6] = av_clipf(src[i + 6], min, max);
-            dst[i + 7] = av_clipf(src[i + 7], min, max);
-        }
+        return;
+    }
+
+    for (int i = 0; i < len; i += 8) {
+        float tmp[8];
+
+        for (int j = 0; j < 8; j++)
+            tmp[j]= av_clipf(src[i + j], min, max);
+        for (int j = 0; j < 8; j++)
+            dst[i + j] = tmp[j];
     }
 }
 
-- 
2.45.2

* [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf Rémi Denis-Courmont
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

This was added alongside the original SSE(one) DSP function in
0a68cd876e14f76a00df7bb8edbfeb350f8ef617 without rationale. It was
presumably faster on x87, which is no longer relevant since we pretty
much assume SSE2 or later on x86.

Meanwhile this function is ~2.5x slower than the normal floating point
one on SiFive-U74.
---
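For readers unfamiliar with the code being removed, here is a standalone
sketch of the trick, under the assumption lo < 0 < hi with IEEE 754
binary32 floats. The helper names are hypothetical; the comparisons follow
clipf_c_one. A negative float has its sign bit set, so as an unsigned
integer it grows with its magnitude; flipping the sign bit does the same
for positive floats.

```c
#include <stdint.h>
#include <string.h>

/* Bit-cast helpers (hypothetical stand-ins for av_float2int/av_int2float). */
static inline uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Clip using unsigned integer compares only; requires lo < 0 < hi. */
static inline float clipf_bits(float a, float lo, float hi)
{
    const uint32_t sign = UINT32_C(1) << 31;
    uint32_t ai = float_bits(a);

    if (ai > float_bits(lo))                  /* negative and below lo */
        return lo;
    if ((ai ^ sign) > (float_bits(hi) ^ sign)) /* positive and above hi */
        return hi;
    return a;
}
```

Every element costs two integer compares and up to two branches, which is
consistent with this path being slower than a plain floating-point clip on
hardware with fast fmin/fmax.
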
 libavcodec/audiodsp.c | 35 -----------------------------------
 1 file changed, 35 deletions(-)

diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index 9e83f06aaa..fd6a00345f 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -22,44 +22,9 @@
 #include "libavutil/common.h"
 #include "audiodsp.h"
 
-static inline float clipf_c_one(float a, uint32_t mini,
-                                uint32_t maxi, uint32_t maxisign)
-{
-    uint32_t ai = av_float2int(a);
-
-    if (ai > mini)
-        return av_int2float(mini);
-    else if ((ai ^ (1U << 31)) > maxisign)
-        return av_int2float(maxi);
-    else
-        return a;
-}
-
-static void vector_clipf_c_opposite_sign(float *dst, const float *src,
-                                         float min, float max, int len)
-{
-    uint32_t mini        = av_float2int(min);
-    uint32_t maxi        = av_float2int(max);
-    uint32_t maxisign    = maxi ^ (1U << 31);
-
-    for (int i = 0; i < len; i += 8) {
-        float tmp[8];
-
-        for (int j = 0; j < 8; j++)
-            tmp[j]= clipf_c_one(src[i + j], mini, maxi, maxisign);
-        for (int j = 0; j < 8; j++)
-            dst[i + j] = tmp[j];
-    }
-}
-
 static void vector_clipf_c(float *dst, const float *src, int len,
                            float min, float max)
 {
-    if (min < 0 && max > 0) {
-        vector_clipf_c_opposite_sign(dst, src, min, max, len);
-        return;
-    }
-
     for (int i = 0; i < len; i += 8) {
         float tmp[8];
 
-- 
2.45.2

* [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions Rémi Denis-Courmont
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

This is now firmly slower than C.

SiFive-U74 (cycles):
audiodsp.vector_clipf_c:   31.2
audiodsp.vector_clipf_rvf: 39.5
---
 libavcodec/riscv/Makefile        |  1 -
 libavcodec/riscv/audiodsp_init.c |  8 +----
 libavcodec/riscv/audiodsp_rvf.S  | 50 --------------------------------
 3 files changed, 1 insertion(+), 58 deletions(-)
 delete mode 100644 libavcodec/riscv/audiodsp_rvf.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 0bbdd38116..621c099a5b 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -9,7 +9,6 @@ RVVB-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvvb.o
 OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
 RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o
-RV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvf.o
 RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
 OBJS-$(CONFIG_BLOCKDSP) += riscv/blockdsp_init.o
 RVV-OBJS-$(CONFIG_BLOCKDSP) += riscv/blockdsp_rvv.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index f606406429..5750d4d8a7 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -24,8 +24,6 @@
 #include "libavutil/cpu.h"
 #include "libavcodec/audiodsp.h"
 
-void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
-
 int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len);
 void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
                               int32_t max, unsigned int len);
@@ -33,12 +31,9 @@ void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float
 
 av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
 {
-#if HAVE_RV
+#if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RVF)
-        c->vector_clipf = ff_vector_clipf_rvf;
-#if HAVE_RVV
     if (flags & AV_CPU_FLAG_RVB_ADDR) {
         if (flags & AV_CPU_FLAG_RVV_I32) {
             c->scalarproduct_int16 = ff_scalarproduct_int16_rvv;
@@ -48,5 +43,4 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
             c->vector_clipf = ff_vector_clipf_rvv;
     }
 #endif
-#endif
 }
diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S
deleted file mode 100644
index 97aa930ab5..0000000000
--- a/libavcodec/riscv/audiodsp_rvf.S
+++ /dev/null
@@ -1,50 +0,0 @@
-/*
- * Copyright © 2022 Rémi Denis-Courmont.
- *
- * This file is part of FFmpeg.
- *
- * FFmpeg is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * FFmpeg is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with FFmpeg; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
- */
-
-#include "libavutil/riscv/asm.S"
-
-func ff_vector_clipf_rvf, f
-        lpad    0
-NOHWF   fmv.w.x fa0, a3
-NOHWF   fmv.w.x fa1, a4
-1:
-        flw     ft0,   (a1)
-        flw     ft1,  4(a1)
-        fmax.s  ft0, ft0, fa0
-        flw     ft2,  8(a1)
-        fmax.s  ft1, ft1, fa0
-        flw     ft3, 12(a1)
-        fmax.s  ft2, ft2, fa0
-        addi    a2, a2, -4
-        fmax.s  ft3, ft3, fa0
-        addi    a1, a1, 16
-        fmin.s  ft0, ft0, fa1
-        fmin.s  ft1, ft1, fa1
-        fsw     ft0,   (a0)
-        fmin.s  ft2, ft2, fa1
-        fsw     ft1,  4(a0)
-        fmin.s  ft3, ft3, fa1
-        fsw     ft2,  8(a0)
-        fsw     ft3, 12(a0)
-        addi    a0, a0, 16
-        bnez    a2, 1b
-
-        ret
-endfunc
-- 
2.45.2

* [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
                   ` (2 preceding siblings ...)
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
  2024-07-26  6:23 ` [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
  5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

F and D extensions are included in all RISC-V application profiles ever
made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be
selected at compilation time.

Currently, there are no consumers for these two flags. If there is ever a
need to reintroduce F- or D-specific optimisations, we can always use
__riscv_f or __riscv_d compiler predefined macros respectively.
---
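A minimal sketch of what compile-time selection could look like if an F- or
D-specific path is ever needed again. The enum and function names are
hypothetical; only the __riscv_f and __riscv_d predefined macros come from
the message above. On a non-RISC-V compiler neither macro is defined and
the function returns 0.

```c
/* Hypothetical flag values, for illustration only. */
enum {
    EXAMPLE_FP_SINGLE = 1 << 0, /* F: single precision */
    EXAMPLE_FP_DOUBLE = 1 << 1, /* D: double precision */
};

/* Reports which FP extensions the compiler was told to target. */
static inline int example_compiled_fp_extensions(void)
{
    int flags = 0;

#if defined(__riscv_f)
    flags |= EXAMPLE_FP_SINGLE;
#endif
#if defined(__riscv_d)
    flags |= EXAMPLE_FP_DOUBLE;
#endif
    return flags;
}
```

Unlike a run-time probe, the result is fixed when the code is built, which
matches the point that F and D realistically have to be selected at
compilation time.
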
 libavutil/cpu.c           |  2 --
 libavutil/riscv/cpu.c     | 12 ------------
 libavutil/tests/cpu.c     |  2 --
 tests/checkasm/checkasm.c |  2 --
 4 files changed, 18 deletions(-)

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 17afe8858a..6c26182b78 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -184,8 +184,6 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "lasx",     NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX     },    .unit = "flags" },
 #elif ARCH_RISCV
         { "rvi",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI      },    .unit = "flags" },
-        { "rvf",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF      },    .unit = "flags" },
-        { "rvd",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD      },    .unit = "flags" },
         { "rvb",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVB      },    .unit = "flags" },
         { "zve32x",   NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_I32  },    .unit = "flags" },
         { "zve32f",   NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F32  },    .unit = "flags" },
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index e035f4b024..6537e91965 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -58,8 +58,6 @@ int ff_get_cpu_flags_riscv(void)
     if (__riscv_hwprobe(pairs, FF_ARRAY_ELEMS(pairs), 0, NULL, 0) == 0) {
         if (pairs[0].value & RISCV_HWPROBE_BASE_BEHAVIOR_IMA)
             ret |= AV_CPU_FLAG_RVI;
-        if (pairs[1].value & RISCV_HWPROBE_IMA_FD)
-            ret |= AV_CPU_FLAG_RVF | AV_CPU_FLAG_RVD;
 #ifdef RISCV_HWPROBE_IMA_V
         if (pairs[1].value & RISCV_HWPROBE_IMA_V)
             ret |= AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64
@@ -96,10 +94,6 @@ int ff_get_cpu_flags_riscv(void)
 
         if (hwcap & HWCAP_RV('I'))
             ret |= AV_CPU_FLAG_RVI;
-        if (hwcap & HWCAP_RV('F'))
-            ret |= AV_CPU_FLAG_RVF;
-        if (hwcap & HWCAP_RV('D'))
-            ret |= AV_CPU_FLAG_RVD;
         if (hwcap & HWCAP_RV('B'))
             ret |= AV_CPU_FLAG_RVB_ADDR | AV_CPU_FLAG_RVB_BASIC |
                    AV_CPU_FLAG_RVB;
@@ -114,12 +108,6 @@ int ff_get_cpu_flags_riscv(void)
 #ifdef __riscv_i
     ret |= AV_CPU_FLAG_RVI;
 #endif
-#if defined (__riscv_flen) && (__riscv_flen >= 32)
-    ret |= AV_CPU_FLAG_RVF;
-#if (__riscv_flen >= 64)
-    ret |= AV_CPU_FLAG_RVD;
-#endif
-#endif
 
 #ifdef __riscv_zba
     ret |= AV_CPU_FLAG_RVB_ADDR;
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index b4b11775d8..e03fbf94eb 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -86,8 +86,6 @@ static const struct {
     { AV_CPU_FLAG_LASX,      "lasx"       },
 #elif ARCH_RISCV
     { AV_CPU_FLAG_RVI,       "rvi"        },
-    { AV_CPU_FLAG_RVF,       "rvf"        },
-    { AV_CPU_FLAG_RVD,       "rvd"        },
     { AV_CPU_FLAG_RVB_ADDR,  "zba"        },
     { AV_CPU_FLAG_RVB_BASIC, "zbb"        },
     { AV_CPU_FLAG_RVB,       "rvb"        },
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 016f2329b0..49b47f8615 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -291,8 +291,6 @@ static const struct {
 #elif ARCH_RISCV
     { "RVI",      "rvi",      AV_CPU_FLAG_RVI },
     { "misaligned", "misaligned", AV_CPU_FLAG_RV_MISALIGNED },
-    { "RVF",      "rvf",      AV_CPU_FLAG_RVF },
-    { "RVD",      "rvd",      AV_CPU_FLAG_RVD },
     { "RVBaddr",  "rvb_a",    AV_CPU_FLAG_RVB_ADDR },
     { "RVBbasic", "rvb_b",    AV_CPU_FLAG_RVB_BASIC },
     { "RVB",      "rvb",      AV_CPU_FLAG_RVB },
-- 
2.45.2

* [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
                   ` (3 preceding siblings ...)
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
  2024-07-26  9:16   ` Andreas Rheinhardt
  2024-07-26  6:23 ` [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
  5 siblings, 1 reply; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
  To: ffmpeg-devel

---
 doc/APIchanges      | 3 +++
 libavutil/cpu.h     | 3 +++
 libavutil/version.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/doc/APIchanges b/doc/APIchanges
index fb54c3fbc9..16993d310e 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
 
 API changes, most recent first:
 
+2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
+  Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
+
 2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
   Add AV_CPU_FLAG_RVB.
 
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9f419aae02..8af1233e6f 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -22,6 +22,7 @@
 #define AVUTIL_CPU_H
 
 #include <stddef.h>
+#include <libavutil/version.h>
 
 #define AV_CPU_FLAG_FORCE    0x80000000 /* force usage of selected flags (OR) */
 
@@ -82,8 +83,10 @@
 
 // RISC-V extensions
 #define AV_CPU_FLAG_RVI          (1 << 0) ///< I (full GPR bank)
+#if FF_API_RISCV_FD
 #define AV_CPU_FLAG_RVF          (1 << 1) ///< F (single precision FP)
 #define AV_CPU_FLAG_RVD          (1 << 2) ///< D (double precision FP)
+#endif
 #define AV_CPU_FLAG_RVV_I32      (1 << 3) ///< Vectors of 8/16/32-bit int's */
 #define AV_CPU_FLAG_RVV_F32      (1 << 4) ///< Vectors of float's */
 #define AV_CPU_FLAG_RVV_I64      (1 << 5) ///< Vectors of 64-bit int's */
diff --git a/libavutil/version.h b/libavutil/version.h
index 852eeef1d6..df43dcc321 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -113,6 +113,7 @@
 #define FF_API_VULKAN_CONTIGUOUS_MEMORY (LIBAVUTIL_VERSION_MAJOR < 60)
 #define FF_API_H274_FILM_GRAIN_VCS      (LIBAVUTIL_VERSION_MAJOR < 60)
 #define FF_API_MOD_UINTP2               (LIBAVUTIL_VERSION_MAJOR < 60)
+#define FF_API_RISCV_FD                 (LIBAVUTIL_VERSION_MAJOR < 60)
 
 /**
  * @}
-- 
2.45.2

* Re: [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips
  2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
                   ` (4 preceding siblings ...)
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
@ 2024-07-26  6:23 ` Rémi Denis-Courmont
  5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-26  6:23 UTC (permalink / raw)
  To: FFmpeg development discussions and patches



On 25 July 2024 23:25:15 GMT+03:00, "Rémi Denis-Courmont" <remi@remlab.net> wrote:
>Unlike on x86, fmin/fmax are single instructions on RISC-V, not function
>calls. They are much faster than doing a comparison and then branching on
>its result. With this, audiodsp.vector_clipf gets almost twice as fast, and
>a properly unrolled version of it gets 4-5x faster, on SiFive-U74.
>This is only the low-hanging fruit: FFMIN and FFMAX are presumably
>affected as well.
>
>This likely applies to other instruction sets with native IEEE floats,
>especially those lacking a conditional select instruction.

In fact, the same problem occurs on Armv8, and it gets even worse on Armv7, where FFMIN and FFMAX incur calls to fcmp*(). I am not sure the comparison-based macros really work well on anything but x86.

The only way to make FFMIN behave as well as the math.h functions seems to be enabling -funsafe-math-optimizations -ffinite-math-only.


>---
> libavutil/riscv/intmath.h | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
>index 3e7ab864c5..24f165eef1 100644
>--- a/libavutil/riscv/intmath.h
>+++ b/libavutil/riscv/intmath.h
>@@ -22,6 +22,7 @@
> #define AVUTIL_RISCV_INTMATH_H
> 
> #include <stdint.h>
>+#include <math.h>
> 
> #include "config.h"
> #include "libavutil/attributes.h"
>@@ -72,6 +73,24 @@ static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
>     return b;
> }
> 
>+#if defined (__riscv_f) || defined (__riscv_zfinx)
>+#define av_clipf av_clipf_rvf
>+static av_always_inline av_const float av_clipf_rvf(float a, float min,
>+                                                    float max)
>+{
>+    return fminf(fmaxf(a, min), max);
>+}
>+#endif
>+
>+#if defined (__riscv_d) || defined (__riscv_zdinx)
>+#define av_clipd av_clipd_rvd
>+static av_always_inline av_const double av_clipd_rvd(double a, double min,
>+                                                     double max)
>+{
>+    return fmin(fmax(a, min), max);
>+}
>+#endif
>+
> #if defined (__GNUC__) || defined (__clang__)
> static inline av_const int ff_ctz_rv(int x)
> {

* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
  2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
@ 2024-07-26  9:16   ` Andreas Rheinhardt
  2024-07-27 12:22     ` Rémi Denis-Courmont
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Rheinhardt @ 2024-07-26  9:16 UTC (permalink / raw)
  To: ffmpeg-devel

Rémi Denis-Courmont:
> ---
>  doc/APIchanges      | 3 +++
>  libavutil/cpu.h     | 3 +++
>  libavutil/version.h | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/doc/APIchanges b/doc/APIchanges
> index fb54c3fbc9..16993d310e 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
>  
>  API changes, most recent first:
>  
> +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> +  Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> +
>  2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
>    Add AV_CPU_FLAG_RVB.
>  
> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> index 9f419aae02..8af1233e6f 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -22,6 +22,7 @@
>  #define AVUTIL_CPU_H
>  
>  #include <stddef.h>
> +#include <libavutil/version.h>

"version.h"

>  
>  #define AV_CPU_FLAG_FORCE    0x80000000 /* force usage of selected flags (OR) */
>  
> @@ -82,8 +83,10 @@
>  
>  // RISC-V extensions
>  #define AV_CPU_FLAG_RVI          (1 << 0) ///< I (full GPR bank)
> +#if FF_API_RISCV_FD
>  #define AV_CPU_FLAG_RVF          (1 << 1) ///< F (single precision FP)
>  #define AV_CPU_FLAG_RVD          (1 << 2) ///< D (double precision FP)
> +#endif
>  #define AV_CPU_FLAG_RVV_I32      (1 << 3) ///< Vectors of 8/16/32-bit int's */
>  #define AV_CPU_FLAG_RVV_F32      (1 << 4) ///< Vectors of float's */
>  #define AV_CPU_FLAG_RVV_I64      (1 << 5) ///< Vectors of 64-bit int's */
> diff --git a/libavutil/version.h b/libavutil/version.h
> index 852eeef1d6..df43dcc321 100644
> --- a/libavutil/version.h
> +++ b/libavutil/version.h
> @@ -113,6 +113,7 @@
>  #define FF_API_VULKAN_CONTIGUOUS_MEMORY (LIBAVUTIL_VERSION_MAJOR < 60)
>  #define FF_API_H274_FILM_GRAIN_VCS      (LIBAVUTIL_VERSION_MAJOR < 60)
>  #define FF_API_MOD_UINTP2               (LIBAVUTIL_VERSION_MAJOR < 60)
> +#define FF_API_RISCV_FD                 (LIBAVUTIL_VERSION_MAJOR < 60)
>  
>  /**
>   * @}

* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
  2024-07-26  9:16   ` Andreas Rheinhardt
@ 2024-07-27 12:22     ` Rémi Denis-Courmont
  2024-07-27 12:27       ` Rémi Denis-Courmont
  0 siblings, 1 reply; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-27 12:22 UTC (permalink / raw)
  To: ffmpeg-devel

On Friday, 26 July 2024, 12:16:11 EEST, Andreas Rheinhardt wrote:
> Rémi Denis-Courmont:
> > ---
> > 
> >  doc/APIchanges      | 3 +++
> >  libavutil/cpu.h     | 3 +++
> >  libavutil/version.h | 1 +
> >  3 files changed, 7 insertions(+)
> > 
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index fb54c3fbc9..16993d310e 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
> >  
> >  API changes, most recent first:
> > +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> > +  Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> > +
> > 
> >  2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
> >  
> >    Add AV_CPU_FLAG_RVB.
> > 
> > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > index 9f419aae02..8af1233e6f 100644
> > --- a/libavutil/cpu.h
> > +++ b/libavutil/cpu.h
> > @@ -22,6 +22,7 @@
> > 
> >  #define AVUTIL_CPU_H
> >  
> >  #include <stddef.h>
> > 
> > +#include <libavutil/version.h>
> 
> "version.h"

Fixed locally.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/



* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
  2024-07-27 12:22     ` Rémi Denis-Courmont
@ 2024-07-27 12:27       ` Rémi Denis-Courmont
  0 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-27 12:27 UTC (permalink / raw)
  To: ffmpeg-devel

On Saturday, 27 July 2024, 15:22:27 EEST, Rémi Denis-Courmont wrote:
> On Friday, 26 July 2024, 12:16:11 EEST, Andreas Rheinhardt wrote:
> > Rémi Denis-Courmont:
> > > ---
> > > 
> > >  doc/APIchanges      | 3 +++
> > >  libavutil/cpu.h     | 3 +++
> > >  libavutil/version.h | 1 +
> > >  3 files changed, 7 insertions(+)
> > > 
> > > diff --git a/doc/APIchanges b/doc/APIchanges
> > > index fb54c3fbc9..16993d310e 100644
> > > --- a/doc/APIchanges
> > > +++ b/doc/APIchanges
> > > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
> > > 
> > >  API changes, most recent first:
> > > +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> > > +  Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> > > +
> > > 
> > >  2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
> > >  
> > >    Add AV_CPU_FLAG_RVB.
> > > 
> > > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > > index 9f419aae02..8af1233e6f 100644
> > > --- a/libavutil/cpu.h
> > > +++ b/libavutil/cpu.h
> > > @@ -22,6 +22,7 @@
> > > 
> > >  #define AVUTIL_CPU_H
> > >  
> > >  #include <stddef.h>
> > > 
> > > +#include <libavutil/version.h>
> > 
> > "version.h"
> 
> Fixed locally.

Scratch that, the patch is wrong anyway. Will drop it from the series for now.

-- 
レミ・デニ-クールモン
http://www.remlab.net/


