* [FFmpeg-devel] [PATCH 1/4] lavu/riscv: CPU flag for the Zbb extension
2022-10-02 11:54 [FFmpeg-devel] [PATCH 0/4] RISC-V initial bswapdsp Rémi Denis-Courmont
@ 2022-10-02 11:54 ` remi
2022-10-02 11:54 ` [FFmpeg-devel] [PATCH 2/4] lavc/bswapdsp: RISC-V B bswap_buf remi
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: remi @ 2022-10-02 11:54 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
Unfortunately, it is common, and will remain so, that the Bit
manipulations are not enabled at compilation time. This is an official
policy for Debian ports in general (though they do not support RISC-V
officially as of yet) to stick to the minimal target baseline, which
does not include the B extension or even its Zbb subset.
For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or
even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems
impractical. But at least it can work for the byte-swap DSP functions.
---
libavutil/cpu.c | 1 +
libavutil/cpu.h | 1 +
libavutil/riscv/cpu.c | 6 ++++++
tests/checkasm/checkasm.c | 1 +
4 files changed, 9 insertions(+)
diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 5818fd9c1c..2c5f7f4958 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -188,6 +188,7 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
{ "rvv-f32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F32 }, .unit = "flags" },
{ "rvv-i64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_I64 }, .unit = "flags" },
{ "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F64 }, .unit = "flags" },
+ { "rvb-basic",NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVB_BASIC }, .unit = "flags" },
#endif
{ NULL },
};
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 18f42af015..8fa5ea9199 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -86,6 +86,7 @@
#define AV_CPU_FLAG_RVV_F32 (1 << 4) ///< Vectors of float's */
#define AV_CPU_FLAG_RVV_I64 (1 << 5) ///< Vectors of 64-bit int's */
#define AV_CPU_FLAG_RVV_F64 (1 << 6) ///< Vectors of double's
+#define AV_CPU_FLAG_RVB_BASIC (1 << 7) ///< Basic bit-manipulations
/**
* Return the flags which specify extensions supported by the CPU.
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index e234201395..a9263dbb78 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -40,6 +40,8 @@ int ff_get_cpu_flags_riscv(void)
ret |= AV_CPU_FLAG_RVF;
if (hwcap & HWCAP_RV('D'))
ret |= AV_CPU_FLAG_RVD;
+ if (hwcap & HWCAP_RV('B'))
+ ret |= AV_CPU_FLAG_RVB_BASIC;
/* The V extension implies all Zve* functional subsets */
if (hwcap & HWCAP_RV('V'))
@@ -57,6 +59,10 @@ int ff_get_cpu_flags_riscv(void)
#endif
#endif
+#ifdef __riscv_zbb
+ ret |= AV_CPU_FLAG_RVB_BASIC;
+#endif
+
/* If RV-V is enabled statically at compile-time, check the details. */
#ifdef __riscv_vectors
ret |= AV_CPU_FLAG_RVV_I32;
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 90dd7e4634..421bd096c5 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -240,6 +240,7 @@ static const struct {
{ "RVVf32", "rvv_f32", AV_CPU_FLAG_RVV_F32 },
{ "RVVi64", "rvv_i64", AV_CPU_FLAG_RVV_I64 },
{ "RVVf64", "rvv_f64", AV_CPU_FLAG_RVV_F64 },
+ { "RVBbasic", "rvb_b", AV_CPU_FLAG_RVB_BASIC },
#elif ARCH_MIPS
{ "MMI", "mmi", AV_CPU_FLAG_MMI },
{ "MSA", "msa", AV_CPU_FLAG_MSA },
--
2.37.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 6+ messages in thread
* [FFmpeg-devel] [PATCH 2/4] lavc/bswapdsp: RISC-V B bswap_buf
2022-10-02 11:54 [FFmpeg-devel] [PATCH 0/4] RISC-V initial bswapdsp Rémi Denis-Courmont
2022-10-02 11:54 ` [FFmpeg-devel] [PATCH 1/4] lavu/riscv: CPU flag for the Zbb extension remi
@ 2022-10-02 11:54 ` remi
2022-10-02 11:55 ` [FFmpeg-devel] [PATCH 3/4] lavc/bswapdsp: RISC-V V bswap_buf remi
2022-10-02 11:55 ` [FFmpeg-devel] [PATCH 4/4] lavc/bswapdsp: RISC-V V bswap16_buf remi
3 siblings, 0 replies; 6+ messages in thread
From: remi @ 2022-10-02 11:54 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
Simply taking the Zbb REV8 instruction into use in a simple loop gives
some significant savings:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 771.0
But we can also use the 64-bit REV8 as a pseudo-SIMD instruction with
just one additional shift, and one fewer load, effectively doubling the
bandwidth. Consequently, this patch is useful even if the compile-time
target has Zbb enabled for C code:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 341.0 (this patch)
On the other hand, this approach fails miserably for bswap16_buf as the
ratio of shifts and stores becomes unfavorable compared to naïve C:
bswap16_buf_c: 1542.0
bswap16_buf_rvb_b: 1803.7
Unrolling to process 128 bits (4 samples) at a time actually worsens
performance ever so slightly:
bswap_buf_c: 1081.0
bswap_buf_rvb_b: 408.5
---
libavcodec/bswapdsp.c | 4 +-
libavcodec/bswapdsp.h | 1 +
libavcodec/riscv/Makefile | 2 +
libavcodec/riscv/bswapdsp_init.c | 38 ++++++++++++++++++
libavcodec/riscv/bswapdsp_rvb.S | 68 ++++++++++++++++++++++++++++++++
5 files changed, 112 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/riscv/bswapdsp_init.c
create mode 100644 libavcodec/riscv/bswapdsp_rvb.S
diff --git a/libavcodec/bswapdsp.c b/libavcodec/bswapdsp.c
index 4c4ea10acc..f0ea2b55c5 100644
--- a/libavcodec/bswapdsp.c
+++ b/libavcodec/bswapdsp.c
@@ -51,7 +51,9 @@ av_cold void ff_bswapdsp_init(BswapDSPContext *c)
c->bswap_buf = bswap_buf;
c->bswap16_buf = bswap16_buf;
-#if ARCH_X86
+#if ARCH_RISCV
+ ff_bswapdsp_init_riscv(c);
+#elif ARCH_X86
ff_bswapdsp_init_x86(c);
#endif
}
diff --git a/libavcodec/bswapdsp.h b/libavcodec/bswapdsp.h
index 4d19092254..6f4db66115 100644
--- a/libavcodec/bswapdsp.h
+++ b/libavcodec/bswapdsp.h
@@ -27,6 +27,7 @@ typedef struct BswapDSPContext {
} BswapDSPContext;
void ff_bswapdsp_init(BswapDSPContext *c);
+void ff_bswapdsp_init_riscv(BswapDSPContext *c);
void ff_bswapdsp_init_x86(BswapDSPContext *c);
#endif /* AVCODEC_BSWAPDSP_H */
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 0fb2c81c75..db4384bca7 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -3,6 +3,8 @@ RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
riscv/audiodsp_rvf.o
RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
+OBJS-$(CONFIG_BSWAPDSP) += riscv/bswapdsp_init.o \
+ riscv/bswapdsp_rvb.o
OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
diff --git a/libavcodec/riscv/bswapdsp_init.c b/libavcodec/riscv/bswapdsp_init.c
new file mode 100644
index 0000000000..701dbeaaa6
--- /dev/null
+++ b/libavcodec/riscv/bswapdsp_init.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/bswapdsp.h"
+
+void ff_bswap32_buf_rvb(uint32_t *dst, const uint32_t *src, int len);
+
+av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
+{
+#if (__riscv_xlen >= 64)
+ int cpu_flags = av_get_cpu_flags();
+
+ if (cpu_flags & AV_CPU_FLAG_RVB_BASIC)
+ c->bswap_buf = ff_bswap32_buf_rvb;
+#endif
+}
diff --git a/libavcodec/riscv/bswapdsp_rvb.S b/libavcodec/riscv/bswapdsp_rvb.S
new file mode 100644
index 0000000000..91b47bf82d
--- /dev/null
+++ b/libavcodec/riscv/bswapdsp_rvb.S
@@ -0,0 +1,68 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/riscv/asm.S"
+
+#if (__riscv_xlen >= 64)
+func ff_bswap32_buf_rvb, zbb
+ andi t0, a1, 4
+ beqz t0, 1f
+ /* Align a1 (input) to 64-bit */
+ lwu t0, (a1)
+ addi a0, a0, 4
+ rev8 t0, t0
+ addi a2, a2, -1
+ srli t0, t0, __riscv_xlen - 32
+ addi a1, a1, 4
+ sw t0, -4(a0)
+1:
+ andi a3, a2, -2
+ sh2add a2, a2, a0
+ beqz a3, 3f
+ sh2add a3, a3, a0
+2: /* 2 elements (64 bits) at a time on a 64-bit boundary */
+ ld t0, (a1)
+ addi a0, a0, 8
+ rev8 t0, t0
+#if (__riscv_xlen == 64)
+ srli t2, t0, 32
+ sw t0, -4(a0)
+#else
+ srli t1, t0, __riscv_xlen - 64
+ srli t2, t0, __riscv_xlen - 32
+ sw t1, -4(a0)
+#endif
+ addi a1, a1, 8
+ sw t2, -8(a0)
+ bne a0, a3, 2b
+3:
+ beq a0, a2, 5f
+4: /* Process last element */
+ lwu t0, (a1)
+ addi a0, a0, 4
+ rev8 t0, t0
+ addi a1, a1, 4
+ srli t0, t0, __riscv_xlen - 32
+ sw t0, -4(a0)
+5:
+ ret
+endfunc
+#endif
--
2.37.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 6+ messages in thread
* [FFmpeg-devel] [PATCH 3/4] lavc/bswapdsp: RISC-V V bswap_buf
2022-10-02 11:54 [FFmpeg-devel] [PATCH 0/4] RISC-V initial bswapdsp Rémi Denis-Courmont
2022-10-02 11:54 ` [FFmpeg-devel] [PATCH 1/4] lavu/riscv: CPU flag for the Zbb extension remi
2022-10-02 11:54 ` [FFmpeg-devel] [PATCH 2/4] lavc/bswapdsp: RISC-V B bswap_buf remi
@ 2022-10-02 11:55 ` remi
2022-10-02 11:55 ` [FFmpeg-devel] [PATCH 4/4] lavc/bswapdsp: RISC-V V bswap16_buf remi
3 siblings, 0 replies; 6+ messages in thread
From: remi @ 2022-10-02 11:55 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/Makefile | 1 +
libavcodec/riscv/bswapdsp_init.c | 7 ++++-
libavcodec/riscv/bswapdsp_rvv.S | 45 ++++++++++++++++++++++++++++++++
3 files changed, 52 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/riscv/bswapdsp_rvv.S
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index db4384bca7..b94901ce8d 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -5,6 +5,7 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
OBJS-$(CONFIG_BSWAPDSP) += riscv/bswapdsp_init.o \
riscv/bswapdsp_rvb.o
+RVV-OBJS-$(CONFIG_BSWAPDSP) += riscv/bswapdsp_rvv.o
OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
diff --git a/libavcodec/riscv/bswapdsp_init.c b/libavcodec/riscv/bswapdsp_init.c
index 701dbeaaa6..c17b6b75bb 100644
--- a/libavcodec/riscv/bswapdsp_init.c
+++ b/libavcodec/riscv/bswapdsp_init.c
@@ -26,13 +26,18 @@
#include "libavcodec/bswapdsp.h"
void ff_bswap32_buf_rvb(uint32_t *dst, const uint32_t *src, int len);
+void ff_bswap32_buf_rvv(uint32_t *dst, const uint32_t *src, int len);
av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
{
-#if (__riscv_xlen >= 64)
int cpu_flags = av_get_cpu_flags();
+#if (__riscv_xlen >= 64)
if (cpu_flags & AV_CPU_FLAG_RVB_BASIC)
c->bswap_buf = ff_bswap32_buf_rvb;
#endif
+#if HAVE_RVV
+ if (cpu_flags & AV_CPU_FLAG_RVV_I32)
+ c->bswap_buf = ff_bswap32_buf_rvv;
+#endif
}
diff --git a/libavcodec/riscv/bswapdsp_rvv.S b/libavcodec/riscv/bswapdsp_rvv.S
new file mode 100644
index 0000000000..7ea747b3ce
--- /dev/null
+++ b/libavcodec/riscv/bswapdsp_rvv.S
@@ -0,0 +1,45 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/riscv/asm.S"
+
+func ff_bswap32_buf_rvv, zve32x
+ li t4, 4
+ addi t1, a0, 1
+ addi t2, a0, 2
+ addi t3, a0, 3
+1:
+ vsetvli t0, a2, e8, m1, ta, ma
+ vlseg4e8.v v8, (a1)
+ sub a2, a2, t0
+ sh2add a1, t0, a1
+ vsse8.v v8, (t3), t4
+ sh2add t3, t0, t3
+ vsse8.v v9, (t2), t4
+ sh2add t2, t0, t2
+ vsse8.v v10, (t1), t4
+ sh2add t1, t0, t1
+ vsse8.v v11, (a0), t4
+ sh2add a0, t0, a0
+ bnez a2, 1b
+
+ ret
+endfunc
--
2.37.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 6+ messages in thread
* [FFmpeg-devel] [PATCH 4/4] lavc/bswapdsp: RISC-V V bswap16_buf
2022-10-02 11:54 [FFmpeg-devel] [PATCH 0/4] RISC-V initial bswapdsp Rémi Denis-Courmont
` (2 preceding siblings ...)
2022-10-02 11:55 ` [FFmpeg-devel] [PATCH 3/4] lavc/bswapdsp: RISC-V V bswap_buf remi
@ 2022-10-02 11:55 ` remi
2022-10-05 6:36 ` Lynne
3 siblings, 1 reply; 6+ messages in thread
From: remi @ 2022-10-02 11:55 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/bswapdsp_init.c | 5 ++++-
libavcodec/riscv/bswapdsp_rvv.S | 17 +++++++++++++++++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/bswapdsp_init.c b/libavcodec/riscv/bswapdsp_init.c
index c17b6b75bb..abe84ec1f7 100644
--- a/libavcodec/riscv/bswapdsp_init.c
+++ b/libavcodec/riscv/bswapdsp_init.c
@@ -27,6 +27,7 @@
void ff_bswap32_buf_rvb(uint32_t *dst, const uint32_t *src, int len);
void ff_bswap32_buf_rvv(uint32_t *dst, const uint32_t *src, int len);
+void ff_bswap16_buf_rvv(uint16_t *dst, const uint16_t *src, int len);
av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
{
@@ -37,7 +38,9 @@ av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
c->bswap_buf = ff_bswap32_buf_rvb;
#endif
#if HAVE_RVV
- if (cpu_flags & AV_CPU_FLAG_RVV_I32)
+ if (cpu_flags & AV_CPU_FLAG_RVV_I32) {
c->bswap_buf = ff_bswap32_buf_rvv;
+ c->bswap16_buf = ff_bswap16_buf_rvv;
+ }
#endif
}
diff --git a/libavcodec/riscv/bswapdsp_rvv.S b/libavcodec/riscv/bswapdsp_rvv.S
index 7ea747b3ce..ef2999c1be 100644
--- a/libavcodec/riscv/bswapdsp_rvv.S
+++ b/libavcodec/riscv/bswapdsp_rvv.S
@@ -43,3 +43,20 @@ func ff_bswap32_buf_rvv, zve32x
ret
endfunc
+
+func ff_bswap16_buf_rvv, zve32x
+ li t2, 2
+ addi t1, a0, 1
+1:
+ vsetvli t0, a2, e8, m1, ta, ma
+ vlseg2e8.v v8, (a1)
+ sub a2, a2, t0
+ sh1add a1, t0, a1
+ vsse8.v v8, (t1), t2
+ sh1add t1, t0, t1
+ vsse8.v v9, (a0), t2
+ sh1add a0, t0, a0
+ bnez a2, 1b
+
+ ret
+endfunc
--
2.37.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [FFmpeg-devel] [PATCH 4/4] lavc/bswapdsp: RISC-V V bswap16_buf
2022-10-02 11:55 ` [FFmpeg-devel] [PATCH 4/4] lavc/bswapdsp: RISC-V V bswap16_buf remi
@ 2022-10-05 6:36 ` Lynne
0 siblings, 0 replies; 6+ messages in thread
From: Lynne @ 2022-10-05 6:36 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Oct 2, 2022, 13:55 by remi@remlab.net:
> From: Rémi Denis-Courmont <remi@remlab.net>
>
> ---
> libavcodec/riscv/bswapdsp_init.c | 5 ++++-
> libavcodec/riscv/bswapdsp_rvv.S | 17 +++++++++++++++++
> 2 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/riscv/bswapdsp_init.c b/libavcodec/riscv/bswapdsp_init.c
> index c17b6b75bb..abe84ec1f7 100644
> --- a/libavcodec/riscv/bswapdsp_init.c
> +++ b/libavcodec/riscv/bswapdsp_init.c
> @@ -27,6 +27,7 @@
>
> void ff_bswap32_buf_rvb(uint32_t *dst, const uint32_t *src, int len);
> void ff_bswap32_buf_rvv(uint32_t *dst, const uint32_t *src, int len);
> +void ff_bswap16_buf_rvv(uint16_t *dst, const uint16_t *src, int len);
>
> av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
> {
> @@ -37,7 +38,9 @@ av_cold void ff_bswapdsp_init_riscv(BswapDSPContext *c)
> c->bswap_buf = ff_bswap32_buf_rvb;
> #endif
> #if HAVE_RVV
> - if (cpu_flags & AV_CPU_FLAG_RVV_I32)
> + if (cpu_flags & AV_CPU_FLAG_RVV_I32) {
> c->bswap_buf = ff_bswap32_buf_rvv;
> + c->bswap16_buf = ff_bswap16_buf_rvv;
> + }
> #endif
> }
> diff --git a/libavcodec/riscv/bswapdsp_rvv.S b/libavcodec/riscv/bswapdsp_rvv.S
> index 7ea747b3ce..ef2999c1be 100644
> --- a/libavcodec/riscv/bswapdsp_rvv.S
> +++ b/libavcodec/riscv/bswapdsp_rvv.S
> @@ -43,3 +43,20 @@ func ff_bswap32_buf_rvv, zve32x
>
> ret
> endfunc
> +
> +func ff_bswap16_buf_rvv, zve32x
> + li t2, 2
> + addi t1, a0, 1
> +1:
> + vsetvli t0, a2, e8, m1, ta, ma
> + vlseg2e8.v v8, (a1)
> + sub a2, a2, t0
> + sh1add a1, t0, a1
> + vsse8.v v8, (t1), t2
> + sh1add t1, t0, t1
> + vsse8.v v9, (a0), t2
> + sh1add a0, t0, a0
> + bnez a2, 1b
> +
> + ret
> +endfunc
>
Pushed patchset with a minor bump and apichanges
Thanks
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 6+ messages in thread