* [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips
@ 2024-07-25 20:25 Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
Unlike on x86, fmin/fmax are single instructions on RISC-V, not function
calls. They are much faster than doing a comparison and then branching on
its result. With this, audiodsp.vector_clipf gets almost twice as fast, and
a properly unrolled version of it gets 4-5x faster, on SiFive-U74.
This is only the low-hanging fruit: FFMIN and FFMAX are presumably
affected as well.
This likely applies to other instruction sets with native IEEE floats,
especially those lacking a conditional select instruction.
---
libavutil/riscv/intmath.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
index 3e7ab864c5..24f165eef1 100644
--- a/libavutil/riscv/intmath.h
+++ b/libavutil/riscv/intmath.h
@@ -22,6 +22,7 @@
#define AVUTIL_RISCV_INTMATH_H
#include <stdint.h>
+#include <math.h>
#include "config.h"
#include "libavutil/attributes.h"
@@ -72,6 +73,24 @@ static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
return b;
}
+#if defined (__riscv_f) || defined (__riscv_zfinx)
+#define av_clipf av_clipf_rvf
+static av_always_inline av_const float av_clipf_rvf(float a, float min,
+ float max)
+{
+ return fminf(fmaxf(a, min), max);
+}
+#endif
+
+#if defined (__riscv_d) || defined (__riscv_zdinx)
+#define av_clipd av_clipd_rvd
+static av_always_inline av_const double av_clipd_rvd(double a, double min,
+ double max)
+{
+ return fmin(fmax(a, min), max);
+}
+#endif
+
#if defined (__GNUC__) || defined (__clang__)
static inline av_const int ff_ctz_rv(int x)
{
--
2.45.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation Rémi Denis-Courmont
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
Given that source and destination can alias, the compiler was forced to
perform each read-modify-write sequentially. We cannot use the `restrict`
qualifier to avoid this here because the AC-3 encoder uses the function
in-place. Instead, this commit provides an explicit guarantee to the
compiler that batches of 8 elements will not overlap, so that it can
interleave calculations.
In practice, contemporary optimising compilers are able to unroll and keep
the temporary array in FPU registers (without spilling).
On SiFive-U74, this speeds up the same-sign branch by 4x and the
opposite-sign branch by 1.5x.
---
libavcodec/audiodsp.c | 40 +++++++++++++++++-----------------------
1 file changed, 17 insertions(+), 23 deletions(-)
diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index c5427d3535..9e83f06aaa 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -38,41 +38,35 @@ static inline float clipf_c_one(float a, uint32_t mini,
static void vector_clipf_c_opposite_sign(float *dst, const float *src,
float min, float max, int len)
{
- int i;
uint32_t mini = av_float2int(min);
uint32_t maxi = av_float2int(max);
uint32_t maxisign = maxi ^ (1U << 31);
- for (i = 0; i < len; i += 8) {
- dst[i + 0] = clipf_c_one(src[i + 0], mini, maxi, maxisign);
- dst[i + 1] = clipf_c_one(src[i + 1], mini, maxi, maxisign);
- dst[i + 2] = clipf_c_one(src[i + 2], mini, maxi, maxisign);
- dst[i + 3] = clipf_c_one(src[i + 3], mini, maxi, maxisign);
- dst[i + 4] = clipf_c_one(src[i + 4], mini, maxi, maxisign);
- dst[i + 5] = clipf_c_one(src[i + 5], mini, maxi, maxisign);
- dst[i + 6] = clipf_c_one(src[i + 6], mini, maxi, maxisign);
- dst[i + 7] = clipf_c_one(src[i + 7], mini, maxi, maxisign);
+ for (int i = 0; i < len; i += 8) {
+ float tmp[8];
+
+ for (int j = 0; j < 8; j++)
+ tmp[j]= clipf_c_one(src[i + j], mini, maxi, maxisign);
+ for (int j = 0; j < 8; j++)
+ dst[i + j] = tmp[j];
}
}
static void vector_clipf_c(float *dst, const float *src, int len,
float min, float max)
{
- int i;
-
if (min < 0 && max > 0) {
vector_clipf_c_opposite_sign(dst, src, min, max, len);
- } else {
- for (i = 0; i < len; i += 8) {
- dst[i] = av_clipf(src[i], min, max);
- dst[i + 1] = av_clipf(src[i + 1], min, max);
- dst[i + 2] = av_clipf(src[i + 2], min, max);
- dst[i + 3] = av_clipf(src[i + 3], min, max);
- dst[i + 4] = av_clipf(src[i + 4], min, max);
- dst[i + 5] = av_clipf(src[i + 5], min, max);
- dst[i + 6] = av_clipf(src[i + 6], min, max);
- dst[i + 7] = av_clipf(src[i + 7], min, max);
- }
+ return;
+ }
+
+ for (int i = 0; i < len; i += 8) {
+ float tmp[8];
+
+ for (int j = 0; j < 8; j++)
+ tmp[j]= av_clipf(src[i + j], min, max);
+ for (int j = 0; j < 8; j++)
+ dst[i + j] = tmp[j];
}
}
--
2.45.2
* [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf Rémi Denis-Courmont
` (3 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
This was added alongside the original SSE(one) DSP function in
0a68cd876e14f76a00df7bb8edbfeb350f8ef617 without rationale. This was
presumably faster on x87, which is no longer relevant since we pretty
much assume SSE2 or later on x86.
Meanwhile, this function is ~2.5x slower than the normal floating point
one on SiFive-U74.
---
libavcodec/audiodsp.c | 35 -----------------------------------
1 file changed, 35 deletions(-)
diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index 9e83f06aaa..fd6a00345f 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -22,44 +22,9 @@
#include "libavutil/common.h"
#include "audiodsp.h"
-static inline float clipf_c_one(float a, uint32_t mini,
- uint32_t maxi, uint32_t maxisign)
-{
- uint32_t ai = av_float2int(a);
-
- if (ai > mini)
- return av_int2float(mini);
- else if ((ai ^ (1U << 31)) > maxisign)
- return av_int2float(maxi);
- else
- return a;
-}
-
-static void vector_clipf_c_opposite_sign(float *dst, const float *src,
- float min, float max, int len)
-{
- uint32_t mini = av_float2int(min);
- uint32_t maxi = av_float2int(max);
- uint32_t maxisign = maxi ^ (1U << 31);
-
- for (int i = 0; i < len; i += 8) {
- float tmp[8];
-
- for (int j = 0; j < 8; j++)
- tmp[j]= clipf_c_one(src[i + j], mini, maxi, maxisign);
- for (int j = 0; j < 8; j++)
- dst[i + j] = tmp[j];
- }
-}
-
static void vector_clipf_c(float *dst, const float *src, int len,
float min, float max)
{
- if (min < 0 && max > 0) {
- vector_clipf_c_opposite_sign(dst, src, min, max, len);
- return;
- }
-
for (int i = 0; i < len; i += 8) {
float tmp[8];
--
2.45.2
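For readers unfamiliar with the code being dropped, the integer-comparison trick can be sketched as follows. This is a hedged reconstruction for illustration only: float_bits and clipf_int_compare are hypothetical names, memcpy() stands in for av_float2int(), and the sketch assumes IEEE 754 binary32 floats and lo < 0 < hi.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned 32-bit integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Integer-only clip; only valid when lo < 0 < hi. */
static float clipf_int_compare(float a, float lo, float hi)
{
    uint32_t ai          = float_bits(a);
    uint32_t lo_bits     = float_bits(lo);
    uint32_t hi_flipped  = float_bits(hi) ^ (1U << 31);

    /* Negative floats have the sign bit set, so as unsigned integers
     * they sort above every positive float, and a larger magnitude
     * gives a larger integer: ai > lo_bits means a < lo. */
    if (ai > lo_bits)
        return lo;
    /* Flipping the sign bit moves positive floats above negative
     * ones, so the same unsigned comparison detects a > hi. */
    if ((ai ^ (1U << 31)) > hi_flipped)
        return hi;
    return a;
}
```

The point of the sketch is only to show why the approach needs bounds of opposite signs and why it replaces floating-point comparisons with integer ones.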
* [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 2/6] lavc/audiodsp: properly unroll vector_clipf Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 3/6] lavc/audiodsp: drop opposite sign optimisation Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions Rémi Denis-Courmont
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
This is now firmly slower than C.
SiFive-U74 (cycles):
audiodsp.vector_clipf_c: 31.2
audiodsp.vector_clipf_rvf: 39.5
---
libavcodec/riscv/Makefile | 1 -
libavcodec/riscv/audiodsp_init.c | 8 +----
libavcodec/riscv/audiodsp_rvf.S | 50 --------------------------------
3 files changed, 1 insertion(+), 58 deletions(-)
delete mode 100644 libavcodec/riscv/audiodsp_rvf.S
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 0bbdd38116..621c099a5b 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -9,7 +9,6 @@ RVVB-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvvb.o
OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o
-RV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvf.o
RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
OBJS-$(CONFIG_BLOCKDSP) += riscv/blockdsp_init.o
RVV-OBJS-$(CONFIG_BLOCKDSP) += riscv/blockdsp_rvv.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index f606406429..5750d4d8a7 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -24,8 +24,6 @@
#include "libavutil/cpu.h"
#include "libavcodec/audiodsp.h"
-void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
-
int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len);
void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
int32_t max, unsigned int len);
@@ -33,12 +31,9 @@ void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float
av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
{
-#if HAVE_RV
+#if HAVE_RVV
int flags = av_get_cpu_flags();
- if (flags & AV_CPU_FLAG_RVF)
- c->vector_clipf = ff_vector_clipf_rvf;
-#if HAVE_RVV
if (flags & AV_CPU_FLAG_RVB_ADDR) {
if (flags & AV_CPU_FLAG_RVV_I32) {
c->scalarproduct_int16 = ff_scalarproduct_int16_rvv;
@@ -48,5 +43,4 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
c->vector_clipf = ff_vector_clipf_rvv;
}
#endif
-#endif
}
diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S
deleted file mode 100644
index 97aa930ab5..0000000000
--- a/libavcodec/riscv/audiodsp_rvf.S
+++ /dev/null
@@ -1,50 +0,0 @@
-/*
- * Copyright © 2022 Rémi Denis-Courmont.
- *
- * This file is part of FFmpeg.
- *
- * FFmpeg is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- *
- * FFmpeg is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with FFmpeg; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
- */
-
-#include "libavutil/riscv/asm.S"
-
-func ff_vector_clipf_rvf, f
- lpad 0
-NOHWF fmv.w.x fa0, a3
-NOHWF fmv.w.x fa1, a4
-1:
- flw ft0, (a1)
- flw ft1, 4(a1)
- fmax.s ft0, ft0, fa0
- flw ft2, 8(a1)
- fmax.s ft1, ft1, fa0
- flw ft3, 12(a1)
- fmax.s ft2, ft2, fa0
- addi a2, a2, -4
- fmax.s ft3, ft3, fa0
- addi a1, a1, 16
- fmin.s ft0, ft0, fa1
- fmin.s ft1, ft1, fa1
- fsw ft0, (a0)
- fmin.s ft2, ft2, fa1
- fsw ft1, 4(a0)
- fmin.s ft3, ft3, fa1
- fsw ft2, 8(a0)
- fsw ft3, 12(a0)
- addi a0, a0, 16
- bnez a2, 1b
-
- ret
-endfunc
--
2.45.2
* [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
` (2 preceding siblings ...)
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 4/6] lavc/audiodsp: drop R-V F vector_clipf Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
2024-07-26 6:23 ` [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
F and D extensions are included in all RISC-V application profiles ever
made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be
selected at compilation time.
Currently, there are no consumers for these two flags. If there is ever a
need to reintroduce F- or D-specific optimisations, we can always use
__riscv_f or __riscv_d compiler predefined macros respectively.
---
libavutil/cpu.c | 2 --
libavutil/riscv/cpu.c | 12 ------------
libavutil/tests/cpu.c | 2 --
tests/checkasm/checkasm.c | 2 --
4 files changed, 18 deletions(-)
diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 17afe8858a..6c26182b78 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -184,8 +184,6 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
{ "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" },
#elif ARCH_RISCV
{ "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" },
- { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" },
- { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" },
{ "rvb", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVB }, .unit = "flags" },
{ "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_I32 }, .unit = "flags" },
{ "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F32 }, .unit = "flags" },
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index e035f4b024..6537e91965 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -58,8 +58,6 @@ int ff_get_cpu_flags_riscv(void)
if (__riscv_hwprobe(pairs, FF_ARRAY_ELEMS(pairs), 0, NULL, 0) == 0) {
if (pairs[0].value & RISCV_HWPROBE_BASE_BEHAVIOR_IMA)
ret |= AV_CPU_FLAG_RVI;
- if (pairs[1].value & RISCV_HWPROBE_IMA_FD)
- ret |= AV_CPU_FLAG_RVF | AV_CPU_FLAG_RVD;
#ifdef RISCV_HWPROBE_IMA_V
if (pairs[1].value & RISCV_HWPROBE_IMA_V)
ret |= AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64
@@ -96,10 +94,6 @@ int ff_get_cpu_flags_riscv(void)
if (hwcap & HWCAP_RV('I'))
ret |= AV_CPU_FLAG_RVI;
- if (hwcap & HWCAP_RV('F'))
- ret |= AV_CPU_FLAG_RVF;
- if (hwcap & HWCAP_RV('D'))
- ret |= AV_CPU_FLAG_RVD;
if (hwcap & HWCAP_RV('B'))
ret |= AV_CPU_FLAG_RVB_ADDR | AV_CPU_FLAG_RVB_BASIC |
AV_CPU_FLAG_RVB;
@@ -114,12 +108,6 @@ int ff_get_cpu_flags_riscv(void)
#ifdef __riscv_i
ret |= AV_CPU_FLAG_RVI;
#endif
-#if defined (__riscv_flen) && (__riscv_flen >= 32)
- ret |= AV_CPU_FLAG_RVF;
-#if (__riscv_flen >= 64)
- ret |= AV_CPU_FLAG_RVD;
-#endif
-#endif
#ifdef __riscv_zba
ret |= AV_CPU_FLAG_RVB_ADDR;
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index b4b11775d8..e03fbf94eb 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -86,8 +86,6 @@ static const struct {
{ AV_CPU_FLAG_LASX, "lasx" },
#elif ARCH_RISCV
{ AV_CPU_FLAG_RVI, "rvi" },
- { AV_CPU_FLAG_RVF, "rvf" },
- { AV_CPU_FLAG_RVD, "rvd" },
{ AV_CPU_FLAG_RVB_ADDR, "zba" },
{ AV_CPU_FLAG_RVB_BASIC, "zbb" },
{ AV_CPU_FLAG_RVB, "rvb" },
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 016f2329b0..49b47f8615 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -291,8 +291,6 @@ static const struct {
#elif ARCH_RISCV
{ "RVI", "rvi", AV_CPU_FLAG_RVI },
{ "misaligned", "misaligned", AV_CPU_FLAG_RV_MISALIGNED },
- { "RVF", "rvf", AV_CPU_FLAG_RVF },
- { "RVD", "rvd", AV_CPU_FLAG_RVD },
{ "RVBaddr", "rvb_a", AV_CPU_FLAG_RVB_ADDR },
{ "RVBbasic", "rvb_b", AV_CPU_FLAG_RVB_BASIC },
{ "RVB", "rvb", AV_CPU_FLAG_RVB },
--
2.45.2
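As a sketch of the compile-time alternative mentioned in the commit message, the compiler's predefined macros can be tested directly. The helper names rv_has_f and rv_has_d are hypothetical; __riscv_f and __riscv_d are the predefined macros named above, and on a non-RISC-V target both helpers simply return 0.

```c
/* 1 if the compiler targets the single-precision F extension, else 0. */
static int rv_has_f(void)
{
#if defined(__riscv_f)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler targets the double-precision D extension, else 0. */
static int rv_has_d(void)
{
#if defined(__riscv_d)
    return 1;
#else
    return 0;
#endif
}
```

In real code the `#if` would more likely guard the optimised function itself, as patch 1/6 does for av_clipf, rather than a helper returning a flag.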
* [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
` (3 preceding siblings ...)
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 5/6] lavc/riscv: drop probing for F & D extensions Rémi Denis-Courmont
@ 2024-07-25 20:25 ` Rémi Denis-Courmont
2024-07-26 9:16 ` Andreas Rheinhardt
2024-07-26 6:23 ` [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
5 siblings, 1 reply; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-25 20:25 UTC (permalink / raw)
To: ffmpeg-devel
---
doc/APIchanges | 3 +++
libavutil/cpu.h | 3 +++
libavutil/version.h | 1 +
3 files changed, 7 insertions(+)
diff --git a/doc/APIchanges b/doc/APIchanges
index fb54c3fbc9..16993d310e 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
API changes, most recent first:
+2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
+ Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
+
2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
Add AV_CPU_FLAG_RVB.
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9f419aae02..8af1233e6f 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -22,6 +22,7 @@
#define AVUTIL_CPU_H
#include <stddef.h>
+#include <libavutil/version.h>
#define AV_CPU_FLAG_FORCE 0x80000000 /* force usage of selected flags (OR) */
@@ -82,8 +83,10 @@
// RISC-V extensions
#define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
+#if FF_API_RISCV_FD
#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP)
#define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP)
+#endif
#define AV_CPU_FLAG_RVV_I32 (1 << 3) ///< Vectors of 8/16/32-bit int's */
#define AV_CPU_FLAG_RVV_F32 (1 << 4) ///< Vectors of float's */
#define AV_CPU_FLAG_RVV_I64 (1 << 5) ///< Vectors of 64-bit int's */
diff --git a/libavutil/version.h b/libavutil/version.h
index 852eeef1d6..df43dcc321 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -113,6 +113,7 @@
#define FF_API_VULKAN_CONTIGUOUS_MEMORY (LIBAVUTIL_VERSION_MAJOR < 60)
#define FF_API_H274_FILM_GRAIN_VCS (LIBAVUTIL_VERSION_MAJOR < 60)
#define FF_API_MOD_UINTP2 (LIBAVUTIL_VERSION_MAJOR < 60)
+#define FF_API_RISCV_FD (LIBAVUTIL_VERSION_MAJOR < 60)
/**
* @}
--
2.45.2
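The deprecation guard used in this patch follows a general pattern that can be sketched in isolation. Everything prefixed DEMO_ or demo_ below is a hypothetical stand-in (for LIBAVUTIL_VERSION_MAJOR, FF_API_RISCV_FD and the flag macros); the value 59 is only chosen to match the version shown in the APIchanges hunk.

```c
/* Stand-in for the library major version. */
#define DEMO_VERSION_MAJOR 59

/* The guard stays true until the next major bump, at which point
 * everything behind it disappears from the public header. */
#define DEMO_API_RISCV_FD (DEMO_VERSION_MAJOR < 60)

#define DEMO_FLAG_RVI (1 << 0)
#if DEMO_API_RISCV_FD
#define DEMO_FLAG_RVF (1 << 1)
#define DEMO_FLAG_RVD (1 << 2)
#endif

/* Reports whether the deprecated flags are still visible. */
static int demo_fd_flags_visible(void)
{
#if DEMO_API_RISCV_FD
    return 1;
#else
    return 0;
#endif
}
```

Changing DEMO_VERSION_MAJOR to 60 makes the two flag macros vanish, which is the behaviour the FF_API_* guards are meant to provide at a major version bump.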
* Re: [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips
2024-07-25 20:25 [FFmpeg-devel] [PATCH 1/6] lavu/riscv: implement floating point clips Rémi Denis-Courmont
` (4 preceding siblings ...)
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
@ 2024-07-26 6:23 ` Rémi Denis-Courmont
5 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-26 6:23 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 25 July 2024 23:25:15 GMT+03:00, "Rémi Denis-Courmont" <remi@remlab.net> wrote:
>Unlike on x86, fmin/fmax are single instructions on RISC-V, not function
>calls. They are much faster than doing a comparison and then branching on
>its result. With this, audiodsp.vector_clipf gets almost twice as fast, and
>a properly unrolled version of it gets 4-5x faster, on SiFive-U74.
>This is only the low-hanging fruit: FFMIN and FFMAX are presumably
>affected as well.
>
>This likely applies to other instruction sets with native IEEE floats,
>especially those lacking a conditional select instruction.
In fact, the same problem occurs on Armv8, and it gets even worse on Armv7, where FFMIN and FFMAX incur calls to fcmp*(). I am not sure if this really works on anything but x86.
The only way to make FFMIN behave as well as the math.h functions seems to be enabling -funsafe-math-optimizations -ffinite-math-only.
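One likely reason the compiler needs those flags is that a comparison-based minimum and fminf() do not agree on NaN inputs, so under strict IEEE semantics one cannot be substituted for the other. A minimal sketch, assuming FFMIN has the usual ternary shape; min_ternary is a hypothetical helper, not an FFmpeg function:

```c
#include <math.h>

/* Ternary minimum with the same shape as a typical MIN macro:
 * any comparison involving NaN is false, so the result depends
 * on which operand is the NaN. */
static float min_ternary(float a, float b)
{
    return a > b ? b : a;
}
```

With a NaN as the first operand, min_ternary() returns the NaN, whereas fminf() returns the other, numeric operand regardless of argument order. Flags such as -ffinite-math-only tell the compiler it may ignore that difference.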
>---
> libavutil/riscv/intmath.h | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
>index 3e7ab864c5..24f165eef1 100644
>--- a/libavutil/riscv/intmath.h
>+++ b/libavutil/riscv/intmath.h
>@@ -22,6 +22,7 @@
> #define AVUTIL_RISCV_INTMATH_H
>
> #include <stdint.h>
>+#include <math.h>
>
> #include "config.h"
> #include "libavutil/attributes.h"
>@@ -72,6 +73,24 @@ static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
> return b;
> }
>
>+#if defined (__riscv_f) || defined (__riscv_zfinx)
>+#define av_clipf av_clipf_rvf
>+static av_always_inline av_const float av_clipf_rvf(float a, float min,
>+ float max)
>+{
>+ return fminf(fmaxf(a, min), max);
>+}
>+#endif
>+
>+#if defined (__riscv_d) || defined (__riscv_zdinx)
>+#define av_clipd av_clipd_rvd
>+static av_always_inline av_const double av_clipd_rvd(double a, double min,
>+ double max)
>+{
>+ return fmin(fmax(a, min), max);
>+}
>+#endif
>+
> #if defined (__GNUC__) || defined (__clang__)
> static inline av_const int ff_ctz_rv(int x)
> {
* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
2024-07-25 20:25 ` [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D} Rémi Denis-Courmont
@ 2024-07-26 9:16 ` Andreas Rheinhardt
2024-07-27 12:22 ` Rémi Denis-Courmont
0 siblings, 1 reply; 10+ messages in thread
From: Andreas Rheinhardt @ 2024-07-26 9:16 UTC (permalink / raw)
To: ffmpeg-devel
Rémi Denis-Courmont:
> ---
> doc/APIchanges | 3 +++
> libavutil/cpu.h | 3 +++
> libavutil/version.h | 1 +
> 3 files changed, 7 insertions(+)
>
> diff --git a/doc/APIchanges b/doc/APIchanges
> index fb54c3fbc9..16993d310e 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
>
> API changes, most recent first:
>
> +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> + Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> +
> 2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
> Add AV_CPU_FLAG_RVB.
>
> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> index 9f419aae02..8af1233e6f 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -22,6 +22,7 @@
> #define AVUTIL_CPU_H
>
> #include <stddef.h>
> +#include <libavutil/version.h>
"version.h"
>
> #define AV_CPU_FLAG_FORCE 0x80000000 /* force usage of selected flags (OR) */
>
> @@ -82,8 +83,10 @@
>
> // RISC-V extensions
> #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
> +#if FF_API_RISCV_FD
> #define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP)
> #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP)
> +#endif
> #define AV_CPU_FLAG_RVV_I32 (1 << 3) ///< Vectors of 8/16/32-bit int's */
> #define AV_CPU_FLAG_RVV_F32 (1 << 4) ///< Vectors of float's */
> #define AV_CPU_FLAG_RVV_I64 (1 << 5) ///< Vectors of 64-bit int's */
> diff --git a/libavutil/version.h b/libavutil/version.h
> index 852eeef1d6..df43dcc321 100644
> --- a/libavutil/version.h
> +++ b/libavutil/version.h
> @@ -113,6 +113,7 @@
> #define FF_API_VULKAN_CONTIGUOUS_MEMORY (LIBAVUTIL_VERSION_MAJOR < 60)
> #define FF_API_H274_FILM_GRAIN_VCS (LIBAVUTIL_VERSION_MAJOR < 60)
> #define FF_API_MOD_UINTP2 (LIBAVUTIL_VERSION_MAJOR < 60)
> +#define FF_API_RISCV_FD (LIBAVUTIL_VERSION_MAJOR < 60)
>
> /**
> * @}
* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
2024-07-26 9:16 ` Andreas Rheinhardt
@ 2024-07-27 12:22 ` Rémi Denis-Courmont
2024-07-27 12:27 ` Rémi Denis-Courmont
0 siblings, 1 reply; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-27 12:22 UTC (permalink / raw)
To: ffmpeg-devel
On Friday, 26 July 2024, 12:16:11 EEST, Andreas Rheinhardt wrote:
> Rémi Denis-Courmont:
> > ---
> >
> > doc/APIchanges | 3 +++
> > libavutil/cpu.h | 3 +++
> > libavutil/version.h | 1 +
> > 3 files changed, 7 insertions(+)
> >
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index fb54c3fbc9..16993d310e 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
> > API changes, most recent first:
> > +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> > + Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> > +
> >
> > 2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
> >
> > Add AV_CPU_FLAG_RVB.
> >
> > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > index 9f419aae02..8af1233e6f 100644
> > --- a/libavutil/cpu.h
> > +++ b/libavutil/cpu.h
> > @@ -22,6 +22,7 @@
> >
> > #define AVUTIL_CPU_H
> >
> > #include <stddef.h>
> >
> > +#include <libavutil/version.h>
>
> "version.h"
Fixed locally.
--
雷米‧德尼-库尔蒙
http://www.remlab.net/
* Re: [FFmpeg-devel] [PATCH 6/6] lavu/cpu: deprecate AV_CPU_FLAG_RV{F, D}
2024-07-27 12:22 ` Rémi Denis-Courmont
@ 2024-07-27 12:27 ` Rémi Denis-Courmont
0 siblings, 0 replies; 10+ messages in thread
From: Rémi Denis-Courmont @ 2024-07-27 12:27 UTC (permalink / raw)
To: ffmpeg-devel
On Saturday, 27 July 2024, 15:22:27 EEST, Rémi Denis-Courmont wrote:
> On Friday, 26 July 2024, 12:16:11 EEST, Andreas Rheinhardt wrote:
> > Rémi Denis-Courmont:
> > > ---
> > >
> > > doc/APIchanges | 3 +++
> > > libavutil/cpu.h | 3 +++
> > > libavutil/version.h | 1 +
> > > 3 files changed, 7 insertions(+)
> > >
> > > diff --git a/doc/APIchanges b/doc/APIchanges
> > > index fb54c3fbc9..16993d310e 100644
> > > --- a/doc/APIchanges
> > > +++ b/doc/APIchanges
> > > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
> > >
> > > API changes, most recent first:
> > > +2024-07-28 - xxxxxxxxx - lavu 59.30.101 - cpu.h
> > > + Deprecate AV_CPU_FLAG_RVF and AV_CPU_FLAG_RVD without replacement.
> > > +
> > >
> > > 2024-07-25 - xxxxxxxxx - lavu 59.29.100 - cpu.h
> > >
> > > Add AV_CPU_FLAG_RVB.
> > >
> > > diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> > > index 9f419aae02..8af1233e6f 100644
> > > --- a/libavutil/cpu.h
> > > +++ b/libavutil/cpu.h
> > > @@ -22,6 +22,7 @@
> > >
> > > #define AVUTIL_CPU_H
> > >
> > > #include <stddef.h>
> > >
> > > +#include <libavutil/version.h>
> >
> > "version.h"
>
> Fixed locally.
Scratch that, the patch is wrong anyway. Will drop it from the series for now.
--
レミ・デニ-クールモン
http://www.remlab.net/