* [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings
@ 2022-09-27 20:04 Rémi Denis-Courmont
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 1/7] lavu/riscv: helper to read the vector length remi
` (7 more replies)
0 siblings, 8 replies; 13+ messages in thread
From: Rémi Denis-Courmont @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
Hello,
As a general rule, scalable vector instruction sets should be used with the
largest possible vector length. There are however a number of operations that
just happen with a fixed size, and this patchset exhibits the simplest one I
could find. The proper RISC-V Vector extension guarantees a minimum vector
length of 128 bits. In theory though the underlying specification also allows
for (embedded designs with) only 32 or 64 bits per vector.
The RFC is: how should this be dealt with? The simplest possibility is to
simply assume 128 bits. Indeed, I am not aware of any actual or proposed
processor IP with shorter-than-128-bit vectors, even less so, one upon which
FFmpeg would be used. For what it is worth, ARM SVE guarantees a minimum of
128 bits per vector too. In that case, we can drop the first patch, and
simplify the following ones.
Another option is to expose the vector length via the CPU flags, as proposed
earlier by Lynne. This is unorthodox, though, since the vector length is not a
proper flag. The vector length can readily be retrieved from a read-only
unprivileged CSR, and this patchset instead introduces a simple inline wrapper
for that purpose. The downside of this approach is that calling the wrapper is
nominally undefined behaviour, and will typically raise SIGILL, if the
processor does not support the vector extension.
However, I want to emphasise that the same problem also exists for DSP
functions operating on more than 128 bits. For instance, the inner loop of the
Opus post-filter works with 160 bits. So then, we cannot simply ignore the
variability between processors. RISC-V has existing designs with 128-bit
vectors and announced designs with 256-bit and 512-bit vectors.
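To make the arithmetic concrete, here is a minimal sketch (not part of the patchset; regs_needed is a hypothetical helper name) of how many whole vector registers a fixed-width operation occupies for a given vector length in bytes, as the vlenb CSR would report it. With the 160-bit Opus example, 128-bit vectors need two registers while 256-bit and 512-bit vectors need only one.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: number of whole vector
 * registers needed to hold `bits` bits when each register is `vlenb`
 * bytes long (the value the vlenb CSR would report). */
static size_t regs_needed(size_t bits, size_t vlenb)
{
    size_t vlen_bits = vlenb * 8;

    return (bits + vlen_bits - 1) / vlen_bits; /* round up */
}
```

So code written for a fixed 160-bit operation has to pick its register grouping according to the run-time vector length, or be restricted to processors where the grouping it assumes is sufficient.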
I don't personally have any strong preference for or against either the CPU
flags or the dedicated platform-specific helper approach. And besides, I do
not presume to decide on FFmpeg internal design. But I doubt that
this concern can be ignored entirely.
The following changes since commit 59cb0bd23d61f6ea3bfd86558346e2720aba7f06:
avfilter/vf_extractplanes: add missing break; statement (2022-09-27 19:35:49 +0200)
are available in the Git repository at:
https://git.remlab.net/git/ffmpeg.git rvv-vlen
for you to fetch changes up to 1b80effc9798c164fd7c1953174ccf4a66298aff:
lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned (2022-09-27 22:19:19 +0300)
----------------------------------------------------------------
Rémi Denis-Courmont (7):
lavu/riscv: helper to read the vector length
lavc/idctdsp: RISC-V V put_pixels_clamped function
lavc/idctdsp: RISC-V V add_pixels_clamped function
lavc/idctdsp: RISC-V V put_signed_pixels_clamped function
lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned
lavc/pixblockdsp: RISC-V V 16-bit get_pixels & get_pixels_unaligned
lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned
libavcodec/idctdsp.c | 2 +
libavcodec/idctdsp.h | 2 +
libavcodec/riscv/Makefile | 3 ++
libavcodec/riscv/idctdsp_init.c | 48 ++++++++++++++++++++++
libavcodec/riscv/idctdsp_rvv.S | 80 +++++++++++++++++++++++++++++++++++++
libavcodec/riscv/pixblockdsp_init.c | 20 ++++++++++
libavcodec/riscv/pixblockdsp_rvv.S | 60 ++++++++++++++++++++++++++++
libavutil/riscv/cpu.h | 45 +++++++++++++++++++++
8 files changed, 260 insertions(+)
create mode 100644 libavcodec/riscv/idctdsp_init.c
create mode 100644 libavcodec/riscv/idctdsp_rvv.S
create mode 100644 libavcodec/riscv/pixblockdsp_rvv.S
create mode 100644 libavutil/riscv/cpu.h
--
雷米‧德尼-库尔蒙
http://www.remlab.net/
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH 1/7] lavu/riscv: helper to read the vector length
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
@ 2022-09-27 20:04 ` remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function remi
` (6 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavutil/riscv/cpu.h | 45 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
create mode 100644 libavutil/riscv/cpu.h
diff --git a/libavutil/riscv/cpu.h b/libavutil/riscv/cpu.h
new file mode 100644
index 0000000000..56035f8556
--- /dev/null
+++ b/libavutil/riscv/cpu.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_RISCV_CPU_H
+#define AVUTIL_RISCV_CPU_H
+
+#include "config.h"
+#include <stddef.h>
+#include "libavutil/cpu.h"
+
+#if HAVE_RVV
+/**
+ * Returns the vector size in bytes (always a power of two and at least 4).
+ * This is undefined behaviour if vectors are not implemented.
+ */
+static inline size_t ff_get_rv_vlenb(void)
+{
+ size_t vlenb;
+
+ __asm__ (
+ ".option push\n"
+ ".option arch, +v\n"
+ " csrr %0, vlenb\n"
+ ".option pop\n" : "=r" (vlenb));
+ return vlenb;
+}
+#endif
+#endif
--
2.37.2
* [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 1/7] lavu/riscv: helper to read the vector length remi
@ 2022-09-27 20:04 ` remi
2022-09-28 8:06 ` Rémi Denis-Courmont
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 3/7] lavc/idctdsp: RISC-V V add_pixels_clamped function remi
` (5 subsequent siblings)
7 siblings, 1 reply; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/idctdsp.c | 2 ++
libavcodec/idctdsp.h | 2 ++
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/idctdsp_init.c | 41 +++++++++++++++++++++++++++++++
libavcodec/riscv/idctdsp_rvv.S | 43 +++++++++++++++++++++++++++++++++
5 files changed, 90 insertions(+)
create mode 100644 libavcodec/riscv/idctdsp_init.c
create mode 100644 libavcodec/riscv/idctdsp_rvv.S
diff --git a/libavcodec/idctdsp.c b/libavcodec/idctdsp.c
index 9035003b72..4ee9c3aa74 100644
--- a/libavcodec/idctdsp.c
+++ b/libavcodec/idctdsp.c
@@ -312,6 +312,8 @@ av_cold void ff_idctdsp_init(IDCTDSPContext *c, AVCodecContext *avctx)
ff_idctdsp_init_arm(c, avctx, high_bit_depth);
#elif ARCH_PPC
ff_idctdsp_init_ppc(c, avctx, high_bit_depth);
+#elif ARCH_RISCV
+ ff_idctdsp_init_riscv(c, avctx, high_bit_depth);
#elif ARCH_X86
ff_idctdsp_init_x86(c, avctx, high_bit_depth);
#elif ARCH_MIPS
diff --git a/libavcodec/idctdsp.h b/libavcodec/idctdsp.h
index e8f20acaf2..2bd9820f72 100644
--- a/libavcodec/idctdsp.h
+++ b/libavcodec/idctdsp.h
@@ -114,6 +114,8 @@ void ff_idctdsp_init_arm(IDCTDSPContext *c, AVCodecContext *avctx,
unsigned high_bit_depth);
void ff_idctdsp_init_ppc(IDCTDSPContext *c, AVCodecContext *avctx,
unsigned high_bit_depth);
+void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
+ unsigned high_bit_depth);
void ff_idctdsp_init_x86(IDCTDSPContext *c, AVCodecContext *avctx,
unsigned high_bit_depth);
void ff_idctdsp_init_mips(IDCTDSPContext *c, AVCodecContext *avctx,
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 829a1823d2..96925afdab 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -5,6 +5,8 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
+OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
+RVV-OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_rvv.o
OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
riscv/pixblockdsp_rvi.o
OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
new file mode 100644
index 0000000000..1a6add80da
--- /dev/null
+++ b/libavcodec/riscv/idctdsp_init.c
@@ -0,0 +1,41 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/riscv/cpu.h"
+#include "libavcodec/avcodec.h"
+#include "libavcodec/idctdsp.h"
+
+void ff_put_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
+ ptrdiff_t stride);
+
+av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
+ unsigned high_bit_depth)
+{
+#if HAVE_RVV
+ int flags = av_get_cpu_flags();
+
+ if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16)
+ c->put_pixels_clamped = ff_put_pixels_clamped_rvv;
+#endif
+}
diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
new file mode 100644
index 0000000000..a59edd0a83
--- /dev/null
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -0,0 +1,43 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_put_pixels_clamped_rvv, zve32x
+ vsetivli zero, 8, e16, m1, ta, ma
+ vlseg8e16.v v24, (a0)
+ /* RVV only has signed-signed and unsigned-unsigned clipping.
+ * We need two steps for signed-to-unsigned clipping. */
+ vsetvli t0, zero, e16, m8, ta, ma
+ vmax.vx v24, v24, zero
+
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vnclipu.wi v16, v24, 0
+ vnclipu.wi v17, v25, 0
+ vnclipu.wi v18, v26, 0
+ vnclipu.wi v19, v27, 0
+ vnclipu.wi v20, v28, 0
+ vnclipu.wi v21, v29, 0
+ vnclipu.wi v22, v30, 0
+ vnclipu.wi v23, v31, 0
+ vssseg8e8.v v16, (a1), a2
+ ret
+endfunc
--
2.37.2
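For readers who do not speak RVV, a scalar C model of what the assembly above is meant to compute may help. This is only a sketch under the assumption of an 8x8 block of contiguous 16-bit coefficients, as the segment loads and stores suggest; the names clip_s16_to_u8 and put_pixels_clamped_ref are made up for illustration and are not FFmpeg functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the two-step signed-to-unsigned clipping. */
static uint8_t clip_s16_to_u8(int16_t v)
{
    int16_t nonneg = v < 0 ? 0 : v;  /* step 1: like vmax.vx with zero */

    /* step 2: like vnclipu.wi with a shift of 0 (unsigned saturation) */
    return nonneg > 255 ? 255 : (uint8_t)nonneg;
}

/* Illustrative reference: 64 contiguous coefficients in, 8 rows of 8
 * pixels out, with `stride` bytes between consecutive output rows. */
static void put_pixels_clamped_ref(const int16_t *block, uint8_t *pixels,
                                   ptrdiff_t stride)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            pixels[y * stride + x] = clip_s16_to_u8(block[y * 8 + x]);
}
```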
* [FFmpeg-devel] [PATCH 3/7] lavc/idctdsp: RISC-V V add_pixels_clamped function
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 1/7] lavu/riscv: helper to read the vector length remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function remi
@ 2022-09-27 20:04 ` remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 4/7] lavc/idctdsp: RISC-V V put_signed_pixels_clamped function remi
` (4 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/idctdsp_init.c | 6 +++++-
libavcodec/riscv/idctdsp_rvv.S | 16 ++++++++++++++++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
index 1a6add80da..58b8a6c97a 100644
--- a/libavcodec/riscv/idctdsp_init.c
+++ b/libavcodec/riscv/idctdsp_init.c
@@ -28,6 +28,8 @@
void ff_put_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
ptrdiff_t stride);
+void ff_add_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
+ ptrdiff_t stride);
av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
unsigned high_bit_depth)
@@ -35,7 +37,9 @@ av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
#if HAVE_RVV
int flags = av_get_cpu_flags();
- if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16)
+ if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
c->put_pixels_clamped = ff_put_pixels_clamped_rvv;
+ c->add_pixels_clamped = ff_add_pixels_clamped_rvv;
+ }
#endif
}
diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
index a59edd0a83..e6cb53bd6f 100644
--- a/libavcodec/riscv/idctdsp_rvv.S
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -24,6 +24,7 @@
func ff_put_pixels_clamped_rvv, zve32x
vsetivli zero, 8, e16, m1, ta, ma
vlseg8e16.v v24, (a0)
+1:
/* RVV only has signed-signed and unsigned-unsigned clipping.
* We need two steps for signed-to-unsigned clipping. */
vsetvli t0, zero, e16, m8, ta, ma
@@ -41,3 +42,18 @@ func ff_put_pixels_clamped_rvv, zve32x
vssseg8e8.v v16, (a1), a2
ret
endfunc
+
+func ff_add_pixels_clamped_rvv, zve32x
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vlseg8e16.v v24, (a0)
+ vlsseg8e8.v v16, (a1), a2
+ vwaddu.wv v24, v24, v16
+ vwaddu.wv v25, v25, v17
+ vwaddu.wv v26, v26, v18
+ vwaddu.wv v27, v27, v19
+ vwaddu.wv v28, v28, v20
+ vwaddu.wv v29, v29, v21
+ vwaddu.wv v30, v30, v22
+ vwaddu.wv v31, v31, v23
+ j 1b
+endfunc
--
2.37.2
* [FFmpeg-devel] [PATCH 4/7] lavc/idctdsp: RISC-V V put_signed_pixels_clamped function
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
` (2 preceding siblings ...)
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 3/7] lavc/idctdsp: RISC-V V add_pixels_clamped function remi
@ 2022-09-27 20:04 ` remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 5/7] lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned remi
` (3 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/idctdsp_init.c | 3 +++
libavcodec/riscv/idctdsp_rvv.S | 21 +++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
index 58b8a6c97a..e6e616a555 100644
--- a/libavcodec/riscv/idctdsp_init.c
+++ b/libavcodec/riscv/idctdsp_init.c
@@ -28,6 +28,8 @@
void ff_put_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
ptrdiff_t stride);
+void ff_put_signed_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
+ ptrdiff_t stride);
void ff_add_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
ptrdiff_t stride);
@@ -39,6 +41,7 @@ av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
c->put_pixels_clamped = ff_put_pixels_clamped_rvv;
+ c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_rvv;
c->add_pixels_clamped = ff_add_pixels_clamped_rvv;
}
#endif
diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
index e6cb53bd6f..e0077cc1b4 100644
--- a/libavcodec/riscv/idctdsp_rvv.S
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -43,6 +43,27 @@ func ff_put_pixels_clamped_rvv, zve32x
ret
endfunc
+func ff_put_signed_pixels_clamped_rvv, zve32x
+ vsetivli zero, 8, e16, m1, ta, ma
+ vlseg8e16.v v24, (a0)
+
+ li t1, 128
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vnclip.wi v16, v24, 0
+ vnclip.wi v17, v25, 0
+ vnclip.wi v18, v26, 0
+ vnclip.wi v19, v27, 0
+ vnclip.wi v20, v28, 0
+ vnclip.wi v21, v29, 0
+ vnclip.wi v22, v30, 0
+ vnclip.wi v23, v31, 0
+ vsetvli t0, zero, e8, m8, ta, ma
+ vadd.vx v16, v16, t1
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vssseg8e8.v v16, (a1), a2
+ ret
+endfunc
+
func ff_add_pixels_clamped_rvv, zve32x
vsetivli zero, 8, e8, mf2, ta, ma
vlseg8e16.v v24, (a0)
--
2.37.2
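As a cross-check of the signed variant, the per-sample operation can be modelled in scalar C roughly as follows. This is a sketch, not FFmpeg code, and the function name is hypothetical: the value is first saturated to the signed 8-bit range, then biased by 128 into the unsigned range.

```c
#include <stdint.h>

/* Scalar model of one sample of the signed put (illustrative only). */
static uint8_t clip_s16_to_biased_u8(int16_t v)
{
    /* like vnclip.wi with a shift of 0: saturate to [-128, 127] */
    int clipped = v < -128 ? -128 : v > 127 ? 127 : v;

    /* like vadd.vx with 128: the 8-bit wrap-around yields 0..255 */
    return (uint8_t)(clipped + 128);
}
```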
* [FFmpeg-devel] [PATCH 5/7] lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
` (3 preceding siblings ...)
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 4/7] lavc/idctdsp: RISC-V V put_signed_pixels_clamped function remi
@ 2022-09-27 20:04 ` remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 6/7] lavc/pixblockdsp: RISC-V V 16-bit " remi
` (2 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/Makefile | 1 +
libavcodec/riscv/pixblockdsp_init.c | 12 ++++++++++
libavcodec/riscv/pixblockdsp_rvv.S | 37 +++++++++++++++++++++++++++++
3 files changed, 50 insertions(+)
create mode 100644 libavcodec/riscv/pixblockdsp_rvv.S
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 96925afdab..0fb2c81c75 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -9,5 +9,6 @@ OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
RVV-OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_rvv.o
OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
riscv/pixblockdsp_rvi.o
+RVV-OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_rvv.o
OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index 04bf52649f..69dbd18918 100644
--- a/libavcodec/riscv/pixblockdsp_init.c
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -20,8 +20,10 @@
#include <stdint.h>
+#include "config.h"
#include "libavutil/attributes.h"
#include "libavutil/cpu.h"
+#include "libavutil/riscv/cpu.h"
#include "libavcodec/avcodec.h"
#include "libavcodec/pixblockdsp.h"
@@ -30,6 +32,9 @@ void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels,
void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels,
ptrdiff_t stride);
+void ff_get_pixels_8_rvv(int16_t *block, const uint8_t *pixels,
+ ptrdiff_t stride);
+
av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
AVCodecContext *avctx,
unsigned high_bit_depth)
@@ -42,4 +47,11 @@ av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
else
c->get_pixels = ff_get_pixels_8_rvi;
}
+
+#if HAVE_RVV
+ if ((cpu_flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
+ if (!high_bit_depth)
+ c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_8_rvv;
+ }
+#endif
}
diff --git a/libavcodec/riscv/pixblockdsp_rvv.S b/libavcodec/riscv/pixblockdsp_rvv.S
new file mode 100644
index 0000000000..b7c74b88b5
--- /dev/null
+++ b/libavcodec/riscv/pixblockdsp_rvv.S
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_get_pixels_8_rvv, zve32x
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vlsseg8e8.v v16, (a1), a2
+ vwcvtu.x.x.v v8, v16
+ vwcvtu.x.x.v v9, v17
+ vwcvtu.x.x.v v10, v18
+ vwcvtu.x.x.v v11, v19
+ vwcvtu.x.x.v v12, v20
+ vwcvtu.x.x.v v13, v21
+ vwcvtu.x.x.v v14, v22
+ vwcvtu.x.x.v v15, v23
+ vsseg8e16.v v8, (a0)
+ ret
+endfunc
--
2.37.2
* [FFmpeg-devel] [PATCH 6/7] lavc/pixblockdsp: RISC-V V 16-bit get_pixels & get_pixels_unaligned
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
` (4 preceding siblings ...)
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 5/7] lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned remi
@ 2022-09-27 20:04 ` remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 7/7] lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned remi
2022-09-27 21:32 ` [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Lynne
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/pixblockdsp_init.c | 6 +++++-
libavcodec/riscv/pixblockdsp_rvv.S | 7 +++++++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index 69dbd18918..bbda381c12 100644
--- a/libavcodec/riscv/pixblockdsp_init.c
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -34,6 +34,8 @@ void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels,
void ff_get_pixels_8_rvv(int16_t *block, const uint8_t *pixels,
ptrdiff_t stride);
+void ff_get_pixels_16_rvv(int16_t *block, const uint8_t *pixels,
+ ptrdiff_t stride);
av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
AVCodecContext *avctx,
@@ -50,7 +52,9 @@ av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
#if HAVE_RVV
if ((cpu_flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
- if (!high_bit_depth)
+ if (high_bit_depth)
+ c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_16_rvv;
+ else
c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_8_rvv;
}
#endif
diff --git a/libavcodec/riscv/pixblockdsp_rvv.S b/libavcodec/riscv/pixblockdsp_rvv.S
index b7c74b88b5..5bf83ebe5e 100644
--- a/libavcodec/riscv/pixblockdsp_rvv.S
+++ b/libavcodec/riscv/pixblockdsp_rvv.S
@@ -35,3 +35,10 @@ func ff_get_pixels_8_rvv, zve32x
vsseg8e16.v v8, (a0)
ret
endfunc
+
+func ff_get_pixels_16_rvv, zve32x
+ vsetivli zero, 8, e16, m1, ta, ma
+ vlsseg8e16.v v0, (a1), a2
+ vsseg8e16.v v0, (a0)
+ ret
+endfunc
--
2.37.2
* [FFmpeg-devel] [PATCH 7/7] lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
` (5 preceding siblings ...)
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 6/7] lavc/pixblockdsp: RISC-V V 16-bit " remi
@ 2022-09-27 20:04 ` remi
2022-09-27 21:32 ` [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Lynne
7 siblings, 0 replies; 13+ messages in thread
From: remi @ 2022-09-27 20:04 UTC (permalink / raw)
To: ffmpeg-devel
From: Rémi Denis-Courmont <remi@remlab.net>
---
libavcodec/riscv/pixblockdsp_init.c | 4 ++++
libavcodec/riscv/pixblockdsp_rvv.S | 16 ++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index bbda381c12..aa39a8a665 100644
--- a/libavcodec/riscv/pixblockdsp_init.c
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -36,6 +36,8 @@ void ff_get_pixels_8_rvv(int16_t *block, const uint8_t *pixels,
ptrdiff_t stride);
void ff_get_pixels_16_rvv(int16_t *block, const uint8_t *pixels,
ptrdiff_t stride);
+void ff_diff_pixels_rvv(int16_t *block, const uint8_t *s1, const uint8_t *s2,
+ ptrdiff_t stride);
av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
AVCodecContext *avctx,
@@ -56,6 +58,8 @@ av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_16_rvv;
else
c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_8_rvv;
+
+ c->diff_pixels_unaligned = c->diff_pixels = ff_diff_pixels_rvv;
}
#endif
}
diff --git a/libavcodec/riscv/pixblockdsp_rvv.S b/libavcodec/riscv/pixblockdsp_rvv.S
index 5bf83ebe5e..62cdfe22b1 100644
--- a/libavcodec/riscv/pixblockdsp_rvv.S
+++ b/libavcodec/riscv/pixblockdsp_rvv.S
@@ -42,3 +42,19 @@ func ff_get_pixels_16_rvv, zve32x
vsseg8e16.v v0, (a0)
ret
endfunc
+
+func ff_diff_pixels_rvv, zve32x
+ vsetivli zero, 8, e8, mf2, ta, ma
+ vlsseg8e8.v v16, (a1), a3
+ vlsseg8e8.v v24, (a2), a3
+ vwsubu.vv v8, v16, v24
+ vwsubu.vv v9, v17, v25
+ vwsubu.vv v10, v18, v26
+ vwsubu.vv v11, v19, v27
+ vwsubu.vv v12, v20, v28
+ vwsubu.vv v13, v21, v29
+ vwsubu.vv v14, v22, v30
+ vwsubu.vv v15, v23, v31
+ vsseg8e16.v v8, (a0)
+ ret
+endfunc
--
2.37.2
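For comparison, a scalar C sketch of the intended result, assuming an 8x8 block and the same stride for both sources as in the prototype above (diff_pixels_ref is a made-up name, not an FFmpeg function). The unsigned widening subtraction produces the difference modulo 2^16, which has the same bit pattern as the signed 16-bit difference in the range -255 to 255.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative reference: block[y][x] = s1[y][x] - s2[y][x] for an 8x8
 * block, with `stride` bytes between rows of both source planes. */
static void diff_pixels_ref(int16_t *block, const uint8_t *s1,
                            const uint8_t *s2, ptrdiff_t stride)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            block[y * 8 + x] = (int16_t)(s1[y * stride + x] -
                                         s2[y * stride + x]);
}
```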
* Re: [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
` (6 preceding siblings ...)
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 7/7] lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned remi
@ 2022-09-27 21:32 ` Lynne
2022-09-28 6:03 ` Rémi Denis-Courmont
7 siblings, 1 reply; 13+ messages in thread
From: Lynne @ 2022-09-27 21:32 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Sep 27, 2022, 22:04 by remi@remlab.net:
> Hello,
>
> As a general rule, scalable vector instruction sets should be used with the
> largest possible vector length. There are however a number of operations that
> just happen with a fixed size, and this patchset exhibits the simplest one I
> could find. The proper RISC-V Vector extension guarantees a minimum vector
> length of 128 bits. In theory though the underlying specification also allows
> for (embedded designs with) only 32 or 64 bits per vector.
>
> The RFC is how should this be dealt with? The simplest possibility is to
> simply assume 128 bits. Indeed, I am not aware of any actual or proposed
> processor IP with shorter-than-128-bit vectors, even less so, one upon which
> FFmpeg would be used. For what it is worth, ARM SVE guarantees a minimum of
> 128 bits per vector too. In that case, we can drop the first patch, and
> simplify the following ones.
>
> Another option is to expose the vector length via the CPU flags as proposed
> earlier by Lynne. Though this is unorthodox the vector length is not a proper
> flag. The vector length can readily be retrieved from a read-only unprivileged
> CSR, and this patchset instead introduces a simple inline wrapper therefore.
> The downside of this approach is that this is nominally undefined behaviour,
> and typically will raise a SIGILL, if the processor does not support the
> vector extension.
>
Where's the undefined behavior? If it's guarded by an if, the
function will return the maximum length. I don't mind that it's not
a cpuflag.
* Re: [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings
2022-09-27 21:32 ` [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Lynne
@ 2022-09-28 6:03 ` Rémi Denis-Courmont
2022-09-28 6:51 ` Lynne
0 siblings, 1 reply; 13+ messages in thread
From: Rémi Denis-Courmont @ 2022-09-28 6:03 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 28 September 2022 00:32:42 GMT+03:00, Lynne <dev@lynne.ee> wrote:
>Sep 27, 2022, 22:04 by remi@remlab.net:
>
>> Hello,
>>
>> As a general rule, scalable vector instruction sets should be used with the
>> largest possible vector length. There are however a number of operations that
>> just happen with a fixed size, and this patchset exhibits the simplest one I
>> could find. The proper RISC-V Vector extension guarantees a minimum vector
>> length of 128 bits. In theory though the underlying specification also allows
>> for (embedded designs with) only 32 or 64 bits per vector.
>>
>> The RFC is how should this be dealt with? The simplest possibility is to
>> simply assume 128 bits. Indeed, I am not aware of any actual or proposed
>> processor IP with shorter-than-128-bit vectors, even less so, one upon which
>> FFmpeg would be used. For what it is worth, ARM SVE guarantees a minimum of
>> 128 bits per vector too. In that case, we can drop the first patch, and
>> simplify the following ones.
>>
>> Another option is to expose the vector length via the CPU flags, as proposed
>> earlier by Lynne, though this is unorthodox: the vector length is not a proper
>> flag. The vector length can readily be retrieved from a read-only unprivileged
>> CSR, and this patchset instead introduces a simple inline wrapper for that.
>> The downside of this approach is that this is nominally undefined behaviour,
>> and typically will raise a SIGILL, if the processor does not support the
>> vector extension.
>>
>
>Where's the undefined behavior? If it's guarded by an if, the
>function will return the maximum length. I don't mind that it's not
>a cpuflag.
>
The UB occurs only if the helper is called on a CPU without vectors. As far as I can tell, there is no UB in the patchset itself. My point is rather that the approach is prone to programming errors, not that any such errors exist already.
Someone could accidentally insert or move a call to ff_get_rv_vlenb() before the proper CPU flags check, and not notice that it would cause UB on some CPUs.
That said, if there are no objections to the approach in this series, I'm obviously fine with it.
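The ordering hazard described above can be illustrated with a minimal, portable sketch. It is not FFmpeg code: mock_has_rvv, mock_vlenb and mock_read_vlenb() are hypothetical stand-ins for the AV_CPU_FLAG_RVV_I32 check and for the vector length helper, and the assert() merely models the SIGILL that a real CSR read would typically raise on a CPU without vectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock state standing in for the real CPU (not FFmpeg API). */
static bool   mock_has_rvv; /* stands in for (flags & AV_CPU_FLAG_RVV_I32) */
static size_t mock_vlenb;   /* vector length in bytes, as the vlenb CSR reports it */

/* Stand-in for the vector length helper: reading the CSR without vector
 * support is undefined behaviour; the assertion models the resulting trap. */
static size_t mock_read_vlenb(void)
{
    assert(mock_has_rvv);
    return mock_vlenb;
}

/* Safe ordering: && short-circuits, so the length is read only after the
 * flag check has passed. 16 bytes corresponds to 128 bits. */
static bool can_use_128bit_kernel(void)
{
    return mock_has_rvv && mock_read_vlenb() >= 16;
}
```

Swapping the operands of && would reach mock_read_vlenb() first and trip the assertion whenever mock_has_rvv is false, which is the kind of accidental reordering the message warns about.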
* Re: [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings
2022-09-28 6:03 ` Rémi Denis-Courmont
@ 2022-09-28 6:51 ` Lynne
0 siblings, 0 replies; 13+ messages in thread
From: Lynne @ 2022-09-28 6:51 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Sep 28, 2022, 08:03 by remi@remlab.net:
> The UB occurs if the helper is called on a CPU without vectors. There is no (I think) UB in the patchset. My point is rather that it might be prone to programming errors, not that they would be actual errors already.
>
> Presumably someone could accidentally insert or move a call to ff_get_rv_vlenb() before the proper CPU flags check, and not notice that it would cause UB on some CPUs.
>
> With that said, if there are no objections to the approach in this series, I'm obviously fine with that.
>
I think it's fine as-is. I'm sure we'll get more FATE systems other than your U74
to check for that in the future.
Patchset looks good to me. Will apply in a few hours.
* Re: [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function remi
@ 2022-09-28 8:06 ` Rémi Denis-Courmont
2022-09-28 9:48 ` Lynne
0 siblings, 1 reply; 13+ messages in thread
From: Rémi Denis-Courmont @ 2022-09-28 8:06 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On 27 September 2022 23:04:22 GMT+03:00, remi@remlab.net wrote:
>From: Rémi Denis-Courmont <remi@remlab.net>
>
>---
> libavcodec/idctdsp.c | 2 ++
> libavcodec/idctdsp.h | 2 ++
> libavcodec/riscv/Makefile | 2 ++
> libavcodec/riscv/idctdsp_init.c | 41 +++++++++++++++++++++++++++++++
> libavcodec/riscv/idctdsp_rvv.S | 43 +++++++++++++++++++++++++++++++++
> 5 files changed, 90 insertions(+)
> create mode 100644 libavcodec/riscv/idctdsp_init.c
> create mode 100644 libavcodec/riscv/idctdsp_rvv.S
>
>diff --git a/libavcodec/idctdsp.c b/libavcodec/idctdsp.c
>index 9035003b72..4ee9c3aa74 100644
>--- a/libavcodec/idctdsp.c
>+++ b/libavcodec/idctdsp.c
>@@ -312,6 +312,8 @@ av_cold void ff_idctdsp_init(IDCTDSPContext *c, AVCodecContext *avctx)
> ff_idctdsp_init_arm(c, avctx, high_bit_depth);
> #elif ARCH_PPC
> ff_idctdsp_init_ppc(c, avctx, high_bit_depth);
>+#elif ARCH_RISCV
>+ ff_idctdsp_init_riscv(c, avctx, high_bit_depth);
> #elif ARCH_X86
> ff_idctdsp_init_x86(c, avctx, high_bit_depth);
> #elif ARCH_MIPS
>diff --git a/libavcodec/idctdsp.h b/libavcodec/idctdsp.h
>index e8f20acaf2..2bd9820f72 100644
>--- a/libavcodec/idctdsp.h
>+++ b/libavcodec/idctdsp.h
>@@ -114,6 +114,8 @@ void ff_idctdsp_init_arm(IDCTDSPContext *c, AVCodecContext *avctx,
> unsigned high_bit_depth);
> void ff_idctdsp_init_ppc(IDCTDSPContext *c, AVCodecContext *avctx,
> unsigned high_bit_depth);
>+void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
>+ unsigned high_bit_depth);
> void ff_idctdsp_init_x86(IDCTDSPContext *c, AVCodecContext *avctx,
> unsigned high_bit_depth);
> void ff_idctdsp_init_mips(IDCTDSPContext *c, AVCodecContext *avctx,
>diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
>index 829a1823d2..96925afdab 100644
>--- a/libavcodec/riscv/Makefile
>+++ b/libavcodec/riscv/Makefile
>@@ -5,6 +5,8 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
> RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
> OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
> RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
>+OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_init.o
>+RVV-OBJS-$(CONFIG_IDCTDSP) += riscv/idctdsp_rvv.o
> OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
> riscv/pixblockdsp_rvi.o
> OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
>diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
>new file mode 100644
>index 0000000000..1a6add80da
>--- /dev/null
>+++ b/libavcodec/riscv/idctdsp_init.c
>@@ -0,0 +1,41 @@
>+/*
>+ * Copyright © 2022 Rémi Denis-Courmont.
>+ *
>+ * This file is part of FFmpeg.
>+ *
>+ * FFmpeg is free software; you can redistribute it and/or
>+ * modify it under the terms of the GNU Lesser General Public
>+ * License as published by the Free Software Foundation; either
>+ * version 2.1 of the License, or (at your option) any later version.
>+ *
>+ * FFmpeg is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>+ * Lesser General Public License for more details.
>+ *
>+ * You should have received a copy of the GNU Lesser General Public
>+ * License along with FFmpeg; if not, write to the Free Software
>+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>+ */
>+
>+#include <stdint.h>
>+
>+#include "libavutil/attributes.h"
>+#include "libavutil/cpu.h"
>+#include "libavutil/riscv/cpu.h"
>+#include "libavcodec/avcodec.h"
>+#include "libavcodec/idctdsp.h"
>+
>+void ff_put_pixels_clamped_rvv(const int16_t *block, uint8_t *pixels,
>+ ptrdiff_t stride);
>+
>+av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
>+ unsigned high_bit_depth)
>+{
>+#if HAVE_RVV
>+ int flags = av_get_cpu_flags();
>+
>+ if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16)
>+ c->put_pixels_clamped = ff_put_pixels_clamped_rvv;
>+#endif
>+}
>diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
>new file mode 100644
>index 0000000000..a59edd0a83
>--- /dev/null
>+++ b/libavcodec/riscv/idctdsp_rvv.S
>@@ -0,0 +1,43 @@
>+/*
>+ * Copyright © 2022 Rémi Denis-Courmont.
>+ *
>+ * This file is part of FFmpeg.
>+ *
>+ * FFmpeg is free software; you can redistribute it and/or
>+ * modify it under the terms of the GNU Lesser General Public
>+ * License as published by the Free Software Foundation; either
>+ * version 2.1 of the License, or (at your option) any later version.
>+ *
>+ * FFmpeg is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>+ * Lesser General Public License for more details.
>+ *
>+ * You should have received a copy of the GNU Lesser General Public
>+ * License along with FFmpeg; if not, write to the Free Software
>+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>+ */
>+
>+#include "config.h"
>+#include "../libavutil/riscv/asm.S"
>+
>+func ff_put_pixels_clamped_rvv, zve32x
>+ vsetivli zero, 8, e16, m1, ta, ma
>+ vlseg8e16.v v24, (a0)
>+ /* RVV only has signed-signed and unsigned-unsigned clipping.
>+ * We need two steps for signed-to-unsigned clipping. */
>+ vsetvli t0, zero, e16, m8, ta, ma
>+ vmax.vx v24, v24, zero
>+
>+ vsetivli zero, 8, e8, mf2, ta, ma
>+ vnclipu.wi v16, v24, 0
>+ vnclipu.wi v17, v25, 0
>+ vnclipu.wi v18, v26, 0
>+ vnclipu.wi v19, v27, 0
>+ vnclipu.wi v20, v28, 0
>+ vnclipu.wi v21, v29, 0
>+ vnclipu.wi v22, v30, 0
>+ vnclipu.wi v23, v31, 0
>+ vssseg8e8.v v16, (a1), a2
>+ ret
>+endfunc
>--
>2.37.2
>
This seems to have the same include path problem as Martin noticed (can't test right now).
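
For readers following the quoted assembly, the two-step clamp it performs (vmax.vx against zero, then vnclipu.wi with a shift of 0) can be modelled in scalar C roughly as below. This is only a sketch under the assumption of an 8x8 block stored row by row; put_pixels_clamped_ref is a hypothetical name and not FFmpeg's own C fallback.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the quoted RVV routine (8x8 block, row-major). */
static void put_pixels_clamped_ref(const int16_t *block, uint8_t *pixels,
                                   ptrdiff_t stride)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int16_t v = block[y * 8 + x];
            /* Step 1 (vmax.vx with zero): negative values become 0. */
            uint16_t u = v < 0 ? 0 : (uint16_t)v;
            /* Step 2 (vnclipu.wi, shift 0): saturate to 0..255 while narrowing. */
            pixels[x] = u > 255 ? 255 : (uint8_t)u;
        }
        pixels += stride; /* advance to the next output row */
    }
}
```

The first step is needed because an unsigned clip alone would treat a negative 16-bit value as a large positive one and saturate it to 255 rather than 0.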
* Re: [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function
2022-09-28 8:06 ` Rémi Denis-Courmont
@ 2022-09-28 9:48 ` Lynne
0 siblings, 0 replies; 13+ messages in thread
From: Lynne @ 2022-09-28 9:48 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Sep 28, 2022, 10:06 by remi@remlab.net:
> This seems to have the same include path problem as Martin noticed (can't test right now).
>
Pushed with the same fix wbs applied.
Thanks.
Thread overview: 13+ messages
2022-09-27 20:04 [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Rémi Denis-Courmont
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 1/7] lavu/riscv: helper to read the vector length remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 2/7] lavc/idctdsp: RISC-V V put_pixels_clamped function remi
2022-09-28 8:06 ` Rémi Denis-Courmont
2022-09-28 9:48 ` Lynne
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 3/7] lavc/idctdsp: RISC-V V add_pixels_clamped function remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 4/7] lavc/idctdsp: RISC-V V put_signed_pixels_clamped function remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 5/7] lavc/pixblockdsp: RISC-V V 8-bit get_pixels & get_pixels_unaligned remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 6/7] lavc/pixblockdsp: RISC-V V 16-bit " remi
2022-09-27 20:04 ` [FFmpeg-devel] [PATCH 7/7] lavc/pixblockdsp: RISC-V diff_pixels & diff_pixels_unaligned remi
2022-09-27 21:32 ` [FFmpeg-devel] [RFC] [PATCH 0/7] RISC-V V vector length dealings Lynne
2022-09-28 6:03 ` Rémi Denis-Courmont
2022-09-28 6:51 ` Lynne