* [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp
@ 2023-06-15 10:36 Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min Peiting Shen
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
We optimized six AC3 DSP functions for RISC-V: five with the vector
extension (RVV) and one with the bit-manipulation extension (Zbb).
Performance was measured on Spike, the RISC-V ISA simulator, and the
results are attached to each commit.
shenpeiting (6):
lavc/ac3dsp: RISC-V V ac3_exponent_min
lavc/ac3dsp: RISC-V V float_to_fixed24
lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
lavc/ac3dsp: RISC-V B ac3_extract_exponents
libavcodec/ac3dsp.c | 2 +
libavcodec/ac3dsp.h | 1 +
libavcodec/riscv/Makefile | 3 +
libavcodec/riscv/ac3dsp_init.c | 60 +++++++++
libavcodec/riscv/ac3dsp_rvb.S | 42 ++++++
libavcodec/riscv/ac3dsp_rvv.S | 225 +++++++++++++++++++++++++++++++++
6 files changed, 333 insertions(+)
create mode 100644 libavcodec/riscv/ac3dsp_init.c
create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
--
2.17.1
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
* [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 18:02 ` Rémi Denis-Courmont
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24 Peiting Shen
` (5 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Optimize the scalar minimum search by using RVV instructions
Benchmarks on Spike(cycles):
*exp=1280*4;num_reuse_blocks=5;nb_coefs=16
ac3_exponent_min_c: 1993
ac3_exponent_min_rvv: 258
*exp=1280*4;num_reuse_blocks=19;nb_coefs=255
ac3_exponent_min_c: 99010
ac3_exponent_min_rvv: 3843
The speedup grows as the number of reuse blocks and the number of
coefficients increase.
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/ac3dsp.c | 2 ++
libavcodec/ac3dsp.h | 1 +
libavcodec/riscv/Makefile | 2 ++
libavcodec/riscv/ac3dsp_init.c | 37 +++++++++++++++++++++++++++
libavcodec/riscv/ac3dsp_rvv.S | 46 ++++++++++++++++++++++++++++++++++
5 files changed, 88 insertions(+)
create mode 100644 libavcodec/riscv/ac3dsp_init.c
create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
index 22cb5f242e..302b786b15 100644
--- a/libavcodec/ac3dsp.c
+++ b/libavcodec/ac3dsp.c
@@ -395,5 +395,7 @@ av_cold void ff_ac3dsp_init(AC3DSPContext *c)
ff_ac3dsp_init_x86(c);
#elif ARCH_MIPS
ff_ac3dsp_init_mips(c);
+#elif ARCH_RISCV
+ ff_ac3dsp_init_riscv(c);
#endif
}
diff --git a/libavcodec/ac3dsp.h b/libavcodec/ac3dsp.h
index 33e51e202e..a01bff3d11 100644
--- a/libavcodec/ac3dsp.h
+++ b/libavcodec/ac3dsp.h
@@ -109,6 +109,7 @@ void ff_ac3dsp_init (AC3DSPContext *c);
void ff_ac3dsp_init_arm(AC3DSPContext *c);
void ff_ac3dsp_init_x86(AC3DSPContext *c);
void ff_ac3dsp_init_mips(AC3DSPContext *c);
+void ff_ac3dsp_init_riscv(AC3DSPContext *c);
void ff_ac3dsp_downmix(AC3DSPContext *c, float **samples, float **matrix,
int out_ch, int in_ch, int len);
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index ee17a521fd..a627924cac 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,5 +1,7 @@
OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
+OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o
+RVV-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvv.o
OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
new file mode 100644
index 0000000000..bb67d86998
--- /dev/null
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavcodec/ac3dsp.h"
+#include "libavutil/cpu.h"
+#include "config.h"
+
+void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
+
+av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
+{
+ int flags = av_get_cpu_flags();
+#if HAVE_RVV
+ if (flags & AV_CPU_FLAG_RVV_I32)
+ c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
+#endif
+}
+
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
new file mode 100644
index 0000000000..879123f4a7
--- /dev/null
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -0,0 +1,46 @@
+/*
+ * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_ac3_exponent_min_rvv, zve32x
+ beq a1, x0, 3f
+ li t0, 256
+ addi a1, a1, 1
+1:
+ mv t2, a0
+ mv t3, a1
+ lb t4, (t2)
+2:
+ vsetvli t1, t3, e8, m8
+ vlse8.v v0, (t2), t0
+ vmv.s.x v8, t4
+ sub t3, t3, t1
+ vredminu.vs v8, v0, v8
+ vmv.x.s t4, v8
+ bnez t3, 2b
+ vsetivli t1, 1, e8
+ vse8.v v8, (a0)
+ addi a0, a0, 1
+ addi a2, a2, -1
+ bnez a2, 1b
+3:
+ ret
+endfunc
--
2.17.1
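
Reading aid, not part of the patch: the per-coefficient minimum that the
vector loop above targets can be sketched in plain C as follows. The names
exponent_min_ref and EXP_STRIDE are illustrative, and the 256-byte stride is
an assumption read off the "li t0, 256" in the assembly. This is a sketch of
the intended behaviour, not the FFmpeg C implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed distance in bytes between the exponent rows of two consecutive
 * blocks; taken from the "li t0, 256" stride in the patch. */
#define EXP_STRIDE 256

/* For each of the first nb_coefs coefficients, replace the exponent in
 * block 0 with the minimum over block 0 and the next num_reuse_blocks
 * blocks. Illustrative name, not an FFmpeg symbol. */
void exponent_min_ref(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    assert(num_reuse_blocks >= 0 && nb_coefs >= 0);
    if (num_reuse_blocks == 0)
        return;                         /* nothing to merge */

    for (int i = 0; i < nb_coefs; i++) {
        uint8_t min_exp = exp[i];
        for (int blk = 1; blk <= num_reuse_blocks; blk++) {
            uint8_t next = exp[i + blk * EXP_STRIDE];
            if (next < min_exp)
                min_exp = next;
        }
        exp[i] = min_exp;               /* only block 0 is written */
    }
}
```

Only block 0 is modified, so the order in which coefficients are processed
does not affect the result.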
* [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 18:06 ` Rémi Denis-Courmont
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32 Peiting Shen
` (4 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Replace the scalar float-to-fixed conversion with vector instructions
Benchmarks on Spike(cycles):
len=16
float_to_fixed24_c: 315
float_to_fixed24_rvv: 27
len=160
float_to_fixed24_c: 2871
float_to_fixed24_rvv: 67
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/riscv/ac3dsp_init.c | 5 ++++-
libavcodec/riscv/ac3dsp_rvv.S | 19 +++++++++++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
index bb67d86998..a4e75a7541 100644
--- a/libavcodec/riscv/ac3dsp_init.c
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -25,13 +25,16 @@
#include "config.h"
void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
+void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
{
int flags = av_get_cpu_flags();
#if HAVE_RVV
- if (flags & AV_CPU_FLAG_RVV_I32)
+ if (flags & AV_CPU_FLAG_RVV_I32) {
c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
+ c->float_to_fixed24 = ff_float_to_fixed24_rvv;
+ }
#endif
}
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
index 879123f4a7..d98e72c12c 100644
--- a/libavcodec/riscv/ac3dsp_rvv.S
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -44,3 +44,22 @@ func ff_ac3_exponent_min_rvv, zve32x
3:
ret
endfunc
+
+
+func ff_float_to_fixed24_rvv, zve32x
+ addi t1, x0, 1
+ slli t1, t1, 24
+ fcvt.s.w f1, t1
+1:
+ vsetvli t0, a2, e32, m8
+ vle32.v v0, (a1)
+ vfmul.vf v0, v0, f1
+ vfcvt.x.f.v v16, v0
+ vse32.v v16, (a0)
+ sub a2, a2, t0
+ slli t0, t0, 2
+ add a1, a1, t0
+ add a0, a0, t0
+ bgtz a2, 1b
+ ret
+endfunc
--
2.17.1
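
Reading aid, not part of the patch: the conversion scales each float by 2^24
and rounds to an integer. A minimal C sketch under stated assumptions is shown
below. The function names are illustrative, inputs are assumed to lie in
[-1, 1], and round-to-nearest-even is assumed as the rounding mode (the
assembly uses whatever dynamic rounding mode is active, so this may differ on
exact ties if that mode is changed).

```c
#include <assert.h>
#include <stdint.h>

/* Round to nearest, ties to even, without relying on libm. */
static int32_t round_nearest_even(double v)
{
    int64_t t = (int64_t)v;        /* truncates toward zero */
    double frac = v - (double)t;   /* remainder, in (-1, 1) */

    if (frac > 0.5 || (frac == 0.5 && (t & 1)))
        t++;
    else if (frac < -0.5 || (frac == -0.5 && (t & 1)))
        t--;
    return (int32_t)t;
}

/* Illustrative reference: dst[i] = round(src[i] * 2^24). */
void float_to_fixed24_ref(int32_t *dst, const float *src, unsigned int len)
{
    for (unsigned int i = 0; i < len; i++) {
        /* Assumption: inputs lie in [-1, 1], so the result fits int32_t. */
        assert(src[i] >= -1.0f && src[i] <= 1.0f);
        dst[i] = round_nearest_even((double)src[i] * 16777216.0); /* 2^24 */
    }
}
```

Multiplying by a power of two is exact in binary floating point, so rounding
only matters for inputs with magnitude small enough to leave a fractional
part after scaling.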
* [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24 Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 19:25 ` Rémi Denis-Courmont
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 4/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float Peiting Shen
` (3 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Optimize the scalar int32 sum_square calculation by using RVV instructions
Benchmarks on Spike(cycles):
len=128
ac3_sum_square_butterfly_int32_c: 8497
ac3_sum_square_butterfly_int32_rvv: 258
len=1280
ac3_sum_square_butterfly_int32_c: 84529
ac3_sum_square_butterfly_int32_rvv: 2274
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/riscv/ac3dsp_init.c | 8 +++++
libavcodec/riscv/ac3dsp_rvv.S | 53 ++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
index a4e75a7541..4fd4abe83e 100644
--- a/libavcodec/riscv/ac3dsp_init.c
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -26,6 +26,10 @@
void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
+void ff_ac3_sum_square_butterfly_int32_rvv(int64_t sum[4],
+ const int32_t *coef0,
+ const int32_t *coef1,
+ int len);
av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
{
@@ -35,6 +39,10 @@ av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
c->float_to_fixed24 = ff_float_to_fixed24_rvv;
}
+#if (__riscv_xlen >= 64)
+ if (flags & AV_CPU_FLAG_RVV_I64)
+ c->sum_square_butterfly_int32 = ff_ac3_sum_square_butterfly_int32_rvv;
+#endif
#endif
}
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
index d98e72c12c..4e0d238f85 100644
--- a/libavcodec/riscv/ac3dsp_rvv.S
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -63,3 +63,56 @@ func ff_float_to_fixed24_rvv, zve32x
bgtz a2, 1b
ret
endfunc
+
+
+func ff_ac3_sum_square_butterfly_int32_rvv, zve64x
+ vsetvli t0, a3, e32, m2
+ vle32.v v0, (a1)
+ vle32.v v2, (a2)
+ vadd.vv v4, v0, v2
+ vsub.vv v6, v0, v2
+ vwmul.vv v8, v0, v0
+ vwmul.vv v12, v2, v2
+ vwmul.vv v16, v4, v4
+ vwmul.vv v20, v6, v6
+ sub a3, a3, t0
+ slli t0, t0, 2
+ add a1, a1, t0
+ add a2, a2, t0
+ beq a3, x0, 2f
+1:
+ vsetvli t0, a3, e32, m2
+ vle32.v v0, (a1)
+ vle32.v v2, (a2)
+ vadd.vv v4, v0, v2
+ vsub.vv v6, v0, v2
+ vwmacc.vv v8, v0, v0
+ vwmacc.vv v12, v2, v2
+ vwmacc.vv v16, v4, v4
+ vwmacc.vv v20, v6, v6
+ sub a3, a3, t0
+ slli t0, t0, 2
+ add a1, a1, t0
+ add a2, a2, t0
+ bnez a3, 1b
+2:
+ vsetvli t0, x0, e64, m4
+ vmv.s.x v24, x0
+ vmv.s.x v25, x0
+ vmv.s.x v26, x0
+ vmv.s.x v27, x0
+ vredsum.vs v24, v8, v24
+ vredsum.vs v25, v12, v25
+ vredsum.vs v26, v16, v26
+ vredsum.vs v27, v20, v27
+ vsetivli t0, 1, e64, m1
+ vse64.v v24, (a0)
+ addi a0, a0, 8
+ vse64.v v25, (a0)
+ addi a0, a0, 8
+ vse64.v v26, (a0)
+ addi a0, a0, 8
+ vse64.v v27, (a0)
+ addi a0, a0, 8
+ ret
+endfunc
--
2.17.1
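
Reading aid, not part of the patch: the four outputs are the sums of squares
of the left channel, the right channel, their sum (mid) and their difference
(side). A plain C sketch is given below. The function name is illustrative,
and mid/side are computed in 64 bits here, which is an assumption that
differs from the 32-bit vadd/vsub in the assembly only when the inputs are
large enough to overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative reference for the int32 butterfly sum of squares:
 *   sum[0] = sum(l*l), sum[1] = sum(r*r),
 *   sum[2] = sum((l+r)^2), sum[3] = sum((l-r)^2). */
void sum_square_butterfly_int32_ref(int64_t sum[4],
                                    const int32_t *coef0,
                                    const int32_t *coef1,
                                    int len)
{
    assert(len >= 0);
    sum[0] = sum[1] = sum[2] = sum[3] = 0;

    for (int i = 0; i < len; i++) {
        int64_t lt = coef0[i];
        int64_t rt = coef1[i];
        int64_t md = lt + rt;   /* mid  */
        int64_t sd = lt - rt;   /* side */

        sum[0] += lt * lt;
        sum[1] += rt * rt;
        sum[2] += md * md;
        sum[3] += sd * sd;
    }
}
```

A quick sanity check on any implementation is the identity
sum[2] + sum[3] == 2 * (sum[0] + sum[1]), which holds whenever no
intermediate value overflows.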
* [FFmpeg-devel] [PATCH 4/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
` (2 preceding siblings ...)
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32 Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 5/6] lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size Peiting Shen
` (2 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Optimize the scalar float sum_square calculation by using RVV instructions
Benchmarks on Spike(cycles):
len=128
ac3_sum_square_butterfly_float_c: 7986
ac3_sum_square_butterfly_float_rvv: 146
len=1280
ac3_sum_square_butterfly_float_c: 79410
ac3_sum_square_butterfly_float_rvv: 1154
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/riscv/ac3dsp_init.c | 6 ++++
libavcodec/riscv/ac3dsp_rvv.S | 54 ++++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+)
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
index 4fd4abe83e..d3aa20623a 100644
--- a/libavcodec/riscv/ac3dsp_init.c
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -30,6 +30,10 @@ void ff_ac3_sum_square_butterfly_int32_rvv(int64_t sum[4],
const int32_t *coef0,
const int32_t *coef1,
int len);
+void ff_ac3_sum_square_butterfly_float_rvv(float sum[4],
+ const float *coef0,
+ const float *coef1,
+ int len);
av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
{
@@ -39,6 +43,8 @@ av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
c->float_to_fixed24 = ff_float_to_fixed24_rvv;
}
+ if (flags & AV_CPU_FLAG_RVV_F32)
+ c->sum_square_butterfly_float = ff_ac3_sum_square_butterfly_float_rvv;
#if (__riscv_xlen >= 64)
if (flags & AV_CPU_FLAG_RVV_I64)
c->sum_square_butterfly_int32 = ff_ac3_sum_square_butterfly_int32_rvv;
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
index 4e0d238f85..05a4d44938 100644
--- a/libavcodec/riscv/ac3dsp_rvv.S
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -116,3 +116,57 @@ func ff_ac3_sum_square_butterfly_int32_rvv, zve64x
addi a0, a0, 8
ret
endfunc
+
+
+func ff_ac3_sum_square_butterfly_float_rvv, zve32f
+ #Round Up
+ li t1, 0x61
+ fscsr t1
+ vsetvli t0, a3, e32, m4
+ vle32.v v0, (a1)
+ vle32.v v4, (a2)
+ vfadd.vv v8, v0, v4
+ vfsub.vv v12, v0, v4
+ vfmul.vv v16, v0, v0
+ vfmul.vv v20, v4, v4
+ vfmul.vv v24, v8, v8
+ vfmul.vv v28, v12, v12
+ sub a3, a3, t0
+ slli t0, t0, 2
+ add a1, a1, t0
+ add a2, a2, t0
+ beq a3, x0, 2f
+1:
+ vsetvli t0, a3, e32, m4
+ vle32.v v0, (a1)
+ vle32.v v4, (a2)
+ vfadd.vv v8, v0, v4
+ vfsub.vv v12, v0, v4
+ vfmacc.vv v16, v0, v0
+ vfmacc.vv v20, v4, v4
+ vfmacc.vv v24, v8, v8
+ vfmacc.vv v28, v12, v12
+ sub a3, a3, t0
+ slli t0, t0, 2
+ add a1, a1, t0
+ add a2, a2, t0
+ bnez a3, 1b
+2:
+ vsetvli t0, x0, e32, m4
+ fcvt.s.w f0, x0
+ vfmv.v.f v0, f0
+ vfredsum.vs v0, v16, v0
+ vfredsum.vs v1, v20, v1
+ vfredsum.vs v2, v24, v2
+ vfredsum.vs v3, v28, v3
+ vsetivli t0, 1, e32, m1
+ vse32.v v0, (a0)
+ addi a0, a0, 4
+ vse32.v v1, (a0)
+ addi a0, a0, 4
+ vse32.v v2, (a0)
+ addi a0, a0, 4
+ vse32.v v3, (a0)
+ addi a0, a0, 4
+ ret
+endfunc
--
2.17.1
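
Reading aid, not part of the patch: a vector loop that accumulates into
several lanes and reduces at the end adds the terms in a different order
than a scalar loop. Floating-point addition is not associative, so the two
results can differ in the last bits, and a comparison against the C version
would probably need a tolerance. The sketch below illustrates the idea with
four hypothetical lanes; all names and the lane count are assumptions, not
taken from the patch.

```c
#include <assert.h>

#define LANES 4  /* hypothetical number of independent accumulators */

/* Sum of squares in strict left-to-right order, like a scalar loop. */
float sum_sq_sequential(const float *x, int len)
{
    float acc = 0.0f;
    for (int i = 0; i < len; i++)
        acc += x[i] * x[i];
    return acc;
}

/* Same sum with LANES partial accumulators combined at the end, which is
 * the shape of a vector multiply-accumulate loop followed by a reduction. */
float sum_sq_lanes(const float *x, int len)
{
    float acc[LANES] = { 0.0f };
    for (int i = 0; i < len; i++)
        acc[i % LANES] += x[i] * x[i];
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}

/* Relative comparison: |a - b| <= rel_tol * max(|a|, |b|). */
int nearly_equal(float a, float b, float rel_tol)
{
    float diff = a > b ? a - b : b - a;
    float abs_a = a < 0.0f ? -a : a;
    float abs_b = b < 0.0f ? -b : b;
    float scale = abs_a > abs_b ? abs_a : abs_b;
    return diff <= rel_tol * scale;
}
```

For small integer-valued inputs both orders are exact and agree bit for bit;
the difference shows up only when partial sums need rounding.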
* [FFmpeg-devel] [PATCH 5/6] lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
` (3 preceding siblings ...)
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 4/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents Peiting Shen
2023-06-15 13:57 ` [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Lynne
6 siblings, 0 replies; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Use the RVV strided segment load instructions vlsseg<nf>e<eew> to operate on matrix columns.
Benchmarks on Spike(cycles):
ac3_compute_mantissa_size_c: 2338
ac3_compute_mantissa_size_rvv: 55
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/riscv/ac3dsp_init.c | 3 ++
libavcodec/riscv/ac3dsp_rvv.S | 53 ++++++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
index d3aa20623a..4769213ebc 100644
--- a/libavcodec/riscv/ac3dsp_init.c
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -35,6 +35,8 @@ void ff_ac3_sum_square_butterfly_float_rvv(float sum[4],
const float *coef1,
int len);
+void ff_ac3_compute_mantissa_size_rvv(uint16_t mant_cnt[6][16]);
+
av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
{
int flags = av_get_cpu_flags();
@@ -42,6 +44,7 @@ av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
if (flags & AV_CPU_FLAG_RVV_I32) {
c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
c->float_to_fixed24 = ff_float_to_fixed24_rvv;
+ c->compute_mantissa_size = ff_ac3_compute_mantissa_size_rvv;
}
if (flags & AV_CPU_FLAG_RVV_F32)
c->sum_square_butterfly_float = ff_ac3_sum_square_butterfly_float_rvv;
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
index 05a4d44938..cedd3d7d05 100644
--- a/libavcodec/riscv/ac3dsp_rvv.S
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -170,3 +170,56 @@ func ff_ac3_sum_square_butterfly_float_rvv, zve32f
addi a0, a0, 4
ret
endfunc
+
+
+func ff_ac3_compute_mantissa_size_rvv, zve32x
+ li t1, 32
+ li t2, 3
+ vsetivli t0, 6, e16
+ vlsseg5e16.v v0, (a0), t1
+ #(column[i][1]/3)
+ vdivu.vx v1, v1, t2
+ li t3, 5
+ vwmul.vx v22, v1, t3
+ #(column[i][2]/3)
+ vdivu.vx v2, v2, t2
+ vwmacc.vx v22, t2, v3
+ vsra.vi v4, v4, 1
+ vadd.vv v4, v4, v2
+ li t2, 7
+ vwmacc.vx v22, t2, v4
+
+ addi a0, a0, 10
+ vlsseg8e16.v v5, (a0), t1
+ li t3, 4
+ vwmacc.vx v22, t3, v5
+ li t3, 5
+ vwmacc.vx v22, t3, v6
+ li t3, 6
+ vwmacc.vx v22, t3, v7
+ li t3, 7
+ vwmacc.vx v22, t3, v8
+ li t3, 8
+ vwmacc.vx v22, t3, v9
+ li t3, 9
+ vwmacc.vx v22, t3, v10
+ li t3, 10
+ vwmacc.vx v22, t3, v11
+ li t3, 11
+ vwmacc.vx v22, t3, v12
+
+ addi a0, a0, 16
+ vlsseg3e16.v v5, (a0), t1
+ li t3, 12
+ vwmacc.vx v22, t3, v5
+ li t3, 14
+ vwmacc.vx v22, t3, v6
+ li t3, 16
+ vwmacc.vx v22, t3, v7
+
+ vsetivli t0, 6, e32, m2
+ vmv.s.x v30, x0
+ vredsum.vs v30, v22, v30
+ vmv.x.s a0, v30
+ ret
+endfunc
--
2.17.1
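
Reading aid, not part of the patch: the assembly treats mant_cnt as a 6x16
matrix, weights each column by a bit count, and sums everything. A scalar
sketch of that computation is given below. The function and table names are
illustrative, and the table values are an assumption read off the multipliers
used in the assembly (3 for column 3, then 4 to 12, 14 and 16 for columns 5
to 15); columns 1, 2 and 4 are grouped as in the vdivu/vsra steps.

```c
#include <assert.h>
#include <stdint.h>

/* Bits per mantissa for columns 3 and 5..15, as implied by the multipliers
 * in the assembly. Entries 0, 1, 2 and 4 are unused here. */
static const uint8_t bap_bits_sketch[16] = {
    0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};

/* Illustrative reference for the total mantissa size in bits. */
int compute_mantissa_size_ref(uint16_t mant_cnt[6][16])
{
    int bits = 0;

    for (int blk = 0; blk < 6; blk++) {
        /* column 1: groups of 3 mantissas, 5 bits per group */
        bits += (mant_cnt[blk][1] / 3) * 5;
        /* column 2: groups of 3, column 4: groups of 2, 7 bits per group */
        bits += ((mant_cnt[blk][2] / 3) + (mant_cnt[blk][4] >> 1)) * 7;
        /* column 3: 3 bits per mantissa */
        bits += mant_cnt[blk][3] * bap_bits_sketch[3];
        /* columns 5..15: table lookup */
        for (int bap = 5; bap < 16; bap++)
            bits += mant_cnt[blk][bap] * bap_bits_sketch[bap];
    }
    return bits;
}
```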
* [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
` (4 preceding siblings ...)
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 5/6] lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size Peiting Shen
@ 2023-06-15 10:36 ` Peiting Shen
2023-06-15 19:18 ` Rémi Denis-Courmont
2023-06-15 13:57 ` [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Lynne
6 siblings, 1 reply; 14+ messages in thread
From: Peiting Shen @ 2023-06-15 10:36 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
From: Shen Peiting <shenpeiting@eswincomputing.com>
Use the RVB instruction clz to count the leading zeros of each coefficient instead of calling av_log2.
Benchmarks on Spike(cycles):
ac3_extract_exponents_c: 8226
ac3_extract_exponents_rvb: 1167
Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
---
libavcodec/riscv/Makefile | 3 ++-
libavcodec/riscv/ac3dsp_init.c | 3 +++
libavcodec/riscv/ac3dsp_rvb.S | 42 ++++++++++++++++++++++++++++++++++
3 files changed, 47 insertions(+), 1 deletion(-)
create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index a627924cac..3d0c196cb9 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,7 +1,8 @@
OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o
-RVV-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvv.o
+RVV-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvv.o \
+ riscv/ac3dsp_rvb.o
OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
index 4769213ebc..75cd3c7e11 100644
--- a/libavcodec/riscv/ac3dsp_init.c
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -26,6 +26,7 @@
void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
+void ff_ac3_extract_exponents_rvb(uint8_t *exp, int32_t *coef, int nb_coefs);
void ff_ac3_sum_square_butterfly_int32_rvv(int64_t sum[4],
const int32_t *coef0,
const int32_t *coef1,
@@ -40,6 +41,8 @@ void ff_ac3_compute_mantissa_size_rvv(uint16_t mant_cnt[6][16]);
av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
{
int flags = av_get_cpu_flags();
+ if (flags & AV_CPU_FLAG_RVB_BASIC)
+ c->extract_exponents = ff_ac3_extract_exponents_rvb;
#if HAVE_RVV
if (flags & AV_CPU_FLAG_RVV_I32) {
c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
diff --git a/libavcodec/riscv/ac3dsp_rvb.S b/libavcodec/riscv/ac3dsp_rvb.S
new file mode 100644
index 0000000000..3bf24c7392
--- /dev/null
+++ b/libavcodec/riscv/ac3dsp_rvb.S
@@ -0,0 +1,42 @@
+/*
+ * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/riscv/asm.S"
+
+func ff_ac3_extract_exponents_rvb, zbb
+ li t1, __riscv_xlen - 24
+1:
+ lw t0, (a1)
+ bgez t0, 2f
+ neg t0, t0
+
+2:
+ clz t4, t0
+ sub t4, t4, t1
+ sb t4,(a0)
+ addi a2, a2, -1
+ addi a1, a1, 4
+ addi a0, a0, 1
+
+ bgtz a2, 1b
+
+ ret
+endfunc
\ No newline at end of file
--
2.17.1
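
Reading aid, not part of the patch: with 32-bit words, the exponent the
assembly stores is clz(|coef|) - 8, which yields 24 for a zero coefficient
because RISC-V clz returns the register width for a zero input. The C sketch
below uses a portable leading-zero count so that the zero case is well
defined. The names are illustrative, and |coef| < 2^24 is an assumption
(24-bit fixed-point coefficients).

```c
#include <assert.h>
#include <stdint.h>

/* Portable count of leading zeros in a 32-bit word; returns 32 for 0. */
static int clz32(uint32_t v)
{
    int n = 0;
    if (v == 0)
        return 32;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Illustrative reference: exp[i] = clz32(|coef[i]|) - 8. */
void extract_exponents_ref(uint8_t *exp, const int32_t *coef, int nb_coefs)
{
    for (int i = 0; i < nb_coefs; i++) {
        int32_t c = coef[i];
        /* Negate in unsigned arithmetic to avoid signed overflow. */
        uint32_t v = c < 0 ? 0u - (uint32_t)c : (uint32_t)c;

        /* Assumption: coefficients are 24-bit, so the result is 0..24. */
        assert(v < (1u << 24));
        exp[i] = (uint8_t)(clz32(v) - 8);
    }
}
```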
* Re: [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
` (5 preceding siblings ...)
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents Peiting Shen
@ 2023-06-15 13:57 ` Lynne
2023-06-15 19:10 ` Rémi Denis-Courmont
6 siblings, 1 reply; 14+ messages in thread
From: Lynne @ 2023-06-15 13:57 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Jun 15, 2023, 12:37 by shenpeiting@eswincomputing.com:
> From: Shen Peiting <shenpeiting@eswincomputing.com>
>
> We optimized the six interfaces of AC3 init by RVV, the optimized
> performance was tested on the RISC-V ISA simulator--Spike, and the
> results were attached to each commit.
>
> shenpeiting (6):
> lavc/ac3dsp: RISC-V V ac3_exponent_min
> lavc/ac3dsp: RISC-V V float_to_fixed24
> lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
> lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
> lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
> lavc/ac3dsp: RISC-V B ac3_extract_exponents
>
> libavcodec/ac3dsp.c | 2 +
> libavcodec/ac3dsp.h | 1 +
> libavcodec/riscv/Makefile | 3 +
> libavcodec/riscv/ac3dsp_init.c | 60 +++++++++
> libavcodec/riscv/ac3dsp_rvb.S | 42 ++++++
> libavcodec/riscv/ac3dsp_rvv.S | 225 +++++++++++++++++++++++++++++++++
> 6 files changed, 333 insertions(+)
> create mode 100644 libavcodec/riscv/ac3dsp_init.c
> create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
> create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
>
Could you implement checkasm for this? It shouldn't
be more than a hundred lines, and there are examples,
tests/checkasm/aacpsdsp.c being the most similar.
Since CPUs with the needed extensions aren't released,
we're not doing any FATE runs, and so if the results don't
match the C version, we'll end up with broken code once
they do exist. And no one wants to debug someone else's
assembly.
Those results look far too optimistic, and I'm guessing
it's because they're using a theoretical huge vector size
limit. Could you re-test with something more realistic,
like 256-bit vectors, using checkasm --bench?
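
Reading aid, not part of the original message: the kind of check being asked
for boils down to running a reference and a candidate on identical random
input and comparing the output. The sketch below shows that idea in plain,
self-contained C. It does not use the checkasm API; every name in it is
hypothetical, the exponent range 0..24 and the 256-byte block stride are
assumptions, and the "candidate" is just a second C version standing in for
the assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STRIDE     256                    /* assumed bytes between blocks */
#define MAX_BLOCKS 6                      /* block 0 plus up to 5 reuse blocks */
#define BUF_SIZE   (STRIDE * MAX_BLOCKS)

typedef void (*exp_min_fn)(uint8_t *exp, int num_reuse_blocks, int nb_coefs);

/* Reference: walk one coefficient at a time. */
void exp_min_ref(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    for (int i = 0; i < nb_coefs; i++)
        for (int blk = 1; blk <= num_reuse_blocks; blk++)
            if (exp[i + blk * STRIDE] < exp[i])
                exp[i] = exp[i + blk * STRIDE];
}

/* Candidate: same result computed block by block. */
void exp_min_alt(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    for (int blk = 1; blk <= num_reuse_blocks; blk++)
        for (int i = 0; i < nb_coefs; i++)
            if (exp[i + blk * STRIDE] < exp[i])
                exp[i] = exp[i + blk * STRIDE];
}

/* Deliberately wrong candidate, used to show that mismatches are caught. */
void exp_min_broken(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    (void)num_reuse_blocks;
    for (int i = 0; i < nb_coefs; i++)
        exp[i] = 255;
}

/* Fill two identical buffers with pseudo-random exponents in 0..24, run
 * both implementations, and report whether the whole buffers match. */
int impls_match(exp_min_fn ref, exp_min_fn cand,
                int num_reuse_blocks, int nb_coefs, uint32_t seed)
{
    static uint8_t buf_ref[BUF_SIZE], buf_cand[BUF_SIZE];

    assert(num_reuse_blocks >= 0 && num_reuse_blocks < MAX_BLOCKS);
    assert(nb_coefs >= 0 && nb_coefs <= STRIDE);

    for (int i = 0; i < BUF_SIZE; i++) {
        seed = seed * 1664525u + 1013904223u;   /* simple LCG */
        buf_ref[i] = (uint8_t)((seed >> 24) % 25u);
    }
    memcpy(buf_cand, buf_ref, BUF_SIZE);

    ref(buf_ref, num_reuse_blocks, nb_coefs);
    cand(buf_cand, num_reuse_blocks, nb_coefs);

    /* Comparing the full buffer also catches stray writes outside block 0. */
    return memcmp(buf_ref, buf_cand, BUF_SIZE) == 0;
}
```

Comparing the entire buffer rather than only the first nb_coefs bytes is a
deliberate choice: it also flags an implementation that writes past the
region it is supposed to touch.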
* Re: [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min Peiting Shen
@ 2023-06-15 18:02 ` Rémi Denis-Courmont
0 siblings, 0 replies; 14+ messages in thread
From: Rémi Denis-Courmont @ 2023-06-15 18:02 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
Nihao
On Thursday, 15 June 2023 at 13:36:40 EEST, Peiting Shen wrote:
> From: Shen Peiting <shenpeiting@eswincomputing.com>
>
> Find scalar minimum optimized by using RVV instructions
>
> Benchmarks on Spike(cycles):
> *exp=1280*4;num_reuse_blocks=5;nb_coefs=16
> ac3_exponent_min_c: 1993
> ac3_exponent_min_rvv: 258
> *exp=1280*4;num_reuse_blocks=19;nb_coefs=255
> ac3_exponent_min_c: 99010
> ac3_exponent_min_rvv: 3843
>
> The optimization performance is more obvious with the increase of number of
> reuse blocks and number of coefs.
>
> Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
> Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
> Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
> Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
> ---
> libavcodec/ac3dsp.c | 2 ++
> libavcodec/ac3dsp.h | 1 +
> libavcodec/riscv/Makefile | 2 ++
> libavcodec/riscv/ac3dsp_init.c | 37 +++++++++++++++++++++++++++
> libavcodec/riscv/ac3dsp_rvv.S | 46 ++++++++++++++++++++++++++++++++++
> 5 files changed, 88 insertions(+)
> create mode 100644 libavcodec/riscv/ac3dsp_init.c
> create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
>
> diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
> index 22cb5f242e..302b786b15 100644
> --- a/libavcodec/ac3dsp.c
> +++ b/libavcodec/ac3dsp.c
> @@ -395,5 +395,7 @@ av_cold void ff_ac3dsp_init(AC3DSPContext *c)
> ff_ac3dsp_init_x86(c);
> #elif ARCH_MIPS
> ff_ac3dsp_init_mips(c);
> +#elif ARCH_RISCV
> + ff_ac3dsp_init_riscv(c);
> #endif
> }
> diff --git a/libavcodec/ac3dsp.h b/libavcodec/ac3dsp.h
> index 33e51e202e..a01bff3d11 100644
> --- a/libavcodec/ac3dsp.h
> +++ b/libavcodec/ac3dsp.h
> @@ -109,6 +109,7 @@ void ff_ac3dsp_init (AC3DSPContext *c);
> void ff_ac3dsp_init_arm(AC3DSPContext *c);
> void ff_ac3dsp_init_x86(AC3DSPContext *c);
> void ff_ac3dsp_init_mips(AC3DSPContext *c);
> +void ff_ac3dsp_init_riscv(AC3DSPContext *c);
>
> void ff_ac3dsp_downmix(AC3DSPContext *c, float **samples, float **matrix,
> int out_ch, int in_ch, int len);
> diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
> index ee17a521fd..a627924cac 100644
> --- a/libavcodec/riscv/Makefile
> +++ b/libavcodec/riscv/Makefile
> @@ -1,5 +1,7 @@
> OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
> RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
> +OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o
> +RVV-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvv.o
> OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
> RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
> OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
> diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
> new file mode 100644
> index 0000000000..bb67d86998
> --- /dev/null
> +++ b/libavcodec/riscv/ac3dsp_init.c
> @@ -0,0 +1,37 @@
> +/*
> + * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +#include <stdint.h>
> +
> +#include "libavutil/attributes.h"
> +#include "libavcodec/ac3dsp.h"
> +#include "libavutil/cpu.h"
> +#include "config.h"
> +
> +void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
> +
> +av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> +{
> + int flags = av_get_cpu_flags();
> +#if HAVE_RVV
> + if (flags & AV_CPU_FLAG_RVV_I32)
> + c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
> +#endif
> +}
> +
> diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
> new file mode 100644
> index 0000000000..879123f4a7
> --- /dev/null
> +++ b/libavcodec/riscv/ac3dsp_rvv.S
> @@ -0,0 +1,46 @@
> +/*
> + * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/riscv/asm.S"
> +
> +func ff_ac3_exponent_min_rvv, zve32x
> + beq a1, x0, 3f
Conventionally, we use ABI names for GP and FP registers like almost everybody
else and their moms in RISC-V world. So that would be `zero`.
But in this case, you should use the `beqz` alias anyway.
> + li t0, 256
> + addi a1, a1, 1
> +1:
> + mv t2, a0
AFAICT, t2 is always the same as a0, and thus this is unnecessary.
> + mv t3, a1
> + lb t4, (t2)
> +2:
> + vsetvli t1, t3, e8, m8
> + vlse8.v v0, (t2), t0
> + vmv.s.x v8, t4
> + sub t3, t3, t1
> + vredminu.vs v8, v0, v8
> + vmv.x.s t4, v8
> + bnez t3, 2b
> + vsetivli t1, 1, e8
You're not using the output, so use zero as the destination.
But you don't even need to reset the vector configuration here. Just use
masking to store the one element (you could also transfer to scalar and store,
but that's probably slower than masking).
> + vse8.v v8, (a0)
> + addi a0, a0, 1
> + addi a2, a2, -1
This will stall on an in-order CPU. Please avoid immediately consecutive
interdependent instructions.
> + bnez a2, 1b
> +3:
> + ret
> +endfunc
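For anyone checking the loop structure against the scalar code, here is a minimal C sketch of what the function has to compute. It is modelled on the C version in libavcodec/ac3dsp.c; the name exponent_min_ref is made up for illustration, and the 256-byte stride between blocks is taken from the `li t0, 256` above.

```c
#include <stdint.h>

/* Illustrative scalar model (hypothetical name). Exponents of consecutive
 * blocks are assumed to be 256 bytes apart. */
static void exponent_min_ref(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    if (!num_reuse_blocks)
        return;

    for (int i = 0; i < nb_coefs; i++) {
        uint8_t min_exp = exp[i];

        /* Minimum over this block and the num_reuse_blocks blocks after it. */
        for (int blk = 1; blk <= num_reuse_blocks; blk++) {
            uint8_t next_exp = exp[i + blk * 256];
            if (next_exp < min_exp)
                min_exp = next_exp;
        }
        exp[i] = min_exp;
    }
}
```

The inner loop is what the strided load plus vredminu.vs replaces; the outer loop walks a0 one byte at a time.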
--
Rémi Denis-Courmont
http://www.remlab.net/
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24 Peiting Shen
@ 2023-06-15 18:06 ` Rémi Denis-Courmont
0 siblings, 0 replies; 14+ messages in thread
From: Rémi Denis-Courmont @ 2023-06-15 18:06 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
On Thursday, 15 June 2023 at 13:36:41 EEST, Peiting Shen wrote:
> From: Shen Peiting <shenpeiting@eswincomputing.com>
>
> Vector instructions replace the scalar operations for float-to-fixed conversion
>
> Benchmarks on Spike(cycles):
> len=16
> float_to_fixed24_c: 315
> float_to_fixed24_rvv: 27
> len=160
> float_to_fixed24_c: 2871
> float_to_fixed24_rvv: 67
>
> Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
> Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
> Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
> Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
> ---
> libavcodec/riscv/ac3dsp_init.c | 5 ++++-
> libavcodec/riscv/ac3dsp_rvv.S | 19 +++++++++++++++++++
> 2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
> index bb67d86998..a4e75a7541 100644
> --- a/libavcodec/riscv/ac3dsp_init.c
> +++ b/libavcodec/riscv/ac3dsp_init.c
> @@ -25,13 +25,16 @@
> #include "config.h"
>
> void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
> +void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
>
> av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> {
> int flags = av_get_cpu_flags();
> #if HAVE_RVV
> - if (flags & AV_CPU_FLAG_RVV_I32)
> + if (flags & AV_CPU_FLAG_RVV_I32) {
> c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
> + c->float_to_fixed24 = ff_float_to_fixed24_rvv;
> + }
> #endif
> }
>
> diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
> index 879123f4a7..d98e72c12c 100644
> --- a/libavcodec/riscv/ac3dsp_rvv.S
> +++ b/libavcodec/riscv/ac3dsp_rvv.S
> @@ -44,3 +44,22 @@ func ff_ac3_exponent_min_rvv, zve32x
> 3:
> ret
> endfunc
> +
> +
> +func ff_float_to_fixed24_rvv, zve32x
> + addi t1, x0, 1
That's `li t1, 1` please.
> + slli t1, t1, 24
> + fcvt.s.w f1, t1
Please use ABI names for FPRs, e.g. `ft0`. Nobody wants to have to remember
which ones are callee-saved and which ones aren't.
> +1:
> + vsetvli t0, a2, e32, m8
> + vle32.v v0, (a1)
> + vfmul.vf v0, v0, f1
> + vfcvt.x.f.v v16, v0
> + vse32.v v16, (a0)
> + sub a2, a2, t0
> + slli t0, t0, 2
> + add a1, a1, t0
> + add a0, a0, t0
Use sh2add to save one in three instructions here.
And please interleave scalar and vector instructions so in-order CPU can
potentially multi-issue.
> + bgtz a2, 1b
> + ret
> +endfunc
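To spell out the sh2add suggestion, here is a small C model of the address arithmetic, assuming the Zba extension is available on the target. The helper names sh2add and advance are only for illustration; the point is that one shift-and-add per pointer replaces the separate shift.

```c
#include <stdint.h>

/* Model of Zba "sh2add rd, rs1, rs2": rd = rs2 + (rs1 << 2). */
static inline uint64_t sh2add(uint64_t rs1, uint64_t rs2)
{
    return rs2 + (rs1 << 2);
}

/* Advance both pointers by vl 32-bit elements, as the loop tail does.
 * Hypothetical helper, for illustration only. */
static void advance(uint64_t *dst, uint64_t *src, uint64_t vl)
{
    /* Patch as posted: slli t0, t0, 2 ; add a1, a1, t0 ; add a0, a0, t0 */
    /* Suggested:       sh2add a1, t0, a1 ; sh2add a0, t0, a0            */
    *src = sh2add(vl, *src);
    *dst = sh2add(vl, *dst);
}
```

With the shift folded into each add, the two sh2add instructions no longer depend on a common slli, which also leaves more room to interleave them with the vector instructions.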
--
Реми Дёни-Курмон
http://www.remlab.net/
* Re: [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp
2023-06-15 13:57 ` [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Lynne
@ 2023-06-15 19:10 ` Rémi Denis-Courmont
0 siblings, 0 replies; 14+ messages in thread
From: Rémi Denis-Courmont @ 2023-06-15 19:10 UTC (permalink / raw)
To: FFmpeg development discussions and patches
On Thursday, 15 June 2023 at 16:57:18 EEST, Lynne wrote:
> Jun 15, 2023, 12:37 by shenpeiting@eswincomputing.com:
> > From: Shen Peiting <shenpeiting@eswincomputing.com>
> >
> > We optimized the six interfaces of AC3 init by RVV, the optimized
> > performance was tested on the RISC-V ISA simulator--Spike, and the
> > results were attached to each commit.
> >
> > shenpeiting (6):
> > lavc/ac3dsp: RISC-V V ac3_exponent_min
> > lavc/ac3dsp: RISC-V V float_to_fixed24
> > lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
> > lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
> > lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
> > lavc/ac3dsp: RISC-V B ac3_extract_exponents
> >
> > libavcodec/ac3dsp.c | 2 +
> > libavcodec/ac3dsp.h | 1 +
> > libavcodec/riscv/Makefile | 3 +
> > libavcodec/riscv/ac3dsp_init.c | 60 +++++++++
> > libavcodec/riscv/ac3dsp_rvb.S | 42 ++++++
> > libavcodec/riscv/ac3dsp_rvv.S | 225 +++++++++++++++++++++++++++++++++
> > 6 files changed, 333 insertions(+)
> > create mode 100644 libavcodec/riscv/ac3dsp_init.c
> > create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
> > create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
>
> Could you implement checkasm for this? It shouldn't
> be more than a hundred lines, and there are examples,
> tests/checkasm/aacpsdsp.c being the most similar.
> Since CPUs with the needed extensions aren't released,
> we're not doing any FATE runs,
Well... I accept hardware donations (with regular USB-C power supply and
passive cooling) to back what would be the third generation of RISC-V FATE
instances.
Until R-V-V 1.0 hardware production substitutes unobtainium for silicon, I
also accept Lichee Pi4A or equivalent hardware bundles, which would be able to
run most (but definitely not all) of FFmpeg's RVV functions with a sizable
amount of kludging.
> and so if the results don't
> match the C version, we'll end up with broken code once
> they do exist. And no one wants to debug someone else's
> assembly.
>
> Those results look far too optimistic, and I'm guessing
> it's because they're using a theoretical huge vector size
> limit. Could you re-test with something more realistic,
> like 256-bit vectors, using checkasm --bench?
It could also be that Spike counts everything as one cycle, regardless of the
group multiplier, not (just) the vector size.
--
Rémi Denis-Courmont
http://www.remlab.net/
* Re: [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents Peiting Shen
@ 2023-06-15 19:18 ` Rémi Denis-Courmont
0 siblings, 0 replies; 14+ messages in thread
From: Rémi Denis-Courmont @ 2023-06-15 19:18 UTC (permalink / raw)
To: ffmpeg-devel
On Thursday, 15 June 2023 at 13:36:45 EEST, Peiting Shen wrote:
> From: Shen Peiting <shenpeiting@eswincomputing.com>
>
> Use RVB instruction clz to calculate the number of leading zeros of MSB
> instead of av_log2.
>
> Benchmarks on Spike(cycles):
> ac3_extract_exponents_c: 8226
> ac3_extract_exponents_rvb: 1167
FWIW, RV-Zbb can be benchmarked on real hardware.
I would have done it already if only there was a checkasm case for this.
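For such a test case, the relation between the two formulations is simple enough to write down. Below is a minimal C sketch, assuming GCC or Clang (for __builtin_clz) and 24-bit fixed-point coefficients; extract_exponents_clz is a made-up name, and the scalar formula it mirrors is the `23 - av_log2(v)` one, with 24 for a zero coefficient.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative model (hypothetical name). Assumes abs(coef[i]) < (1 << 24). */
static void extract_exponents_clz(uint8_t *exp, const int32_t *coef, int nb_coefs)
{
    for (int i = 0; i < nb_coefs; i++) {
        unsigned v = (unsigned)abs(coef[i]);

        /* 23 - log2(v) == 23 - (31 - clz(v)) == clz(v) - 8 for v != 0.
         * The builtin is undefined for 0, so zero is mapped to 24 here. */
        exp[i] = v ? (uint8_t)(__builtin_clz(v) - 8) : 24;
    }
}
```

A checkasm test would then only need to compare this kind of output against the existing C function over random 24-bit inputs, zero included.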
--
Rémi Denis-Courmont
http://www.remlab.net/
* Re: [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32 Peiting Shen
@ 2023-06-15 19:25 ` Rémi Denis-Courmont
2023-06-16 10:15 ` 沈佩婷
0 siblings, 1 reply; 14+ messages in thread
From: Rémi Denis-Courmont @ 2023-06-15 19:25 UTC (permalink / raw)
To: ffmpeg-devel; +Cc: Shen Peiting
On Thursday, 15 June 2023 at 13:36:42 EEST, Peiting Shen wrote:
> From: Shen Peiting <shenpeiting@eswincomputing.com>
>
> Scalar calculating int32 sum_square optimized by using RVV instructions
>
> Benchmarks on Spike(cycles):
> len=128
> ac3_sum_square_butterfly_int32_c: 8497
> ac3_sum_square_butterfly_int32_rvv: 258
> len=1280
> ac3_sum_square_butterfly_int32_c: 84529
> ac3_sum_square_butterfly_int32_rvv: 2274
>
> Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
> Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
> Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
> Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
> ---
> libavcodec/riscv/ac3dsp_init.c | 8 +++++
> libavcodec/riscv/ac3dsp_rvv.S | 53 ++++++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)
>
> diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
> index a4e75a7541..4fd4abe83e 100644
> --- a/libavcodec/riscv/ac3dsp_init.c
> +++ b/libavcodec/riscv/ac3dsp_init.c
> @@ -26,6 +26,10 @@
>
> void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
> void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
> +void ff_ac3_sum_square_butterfly_int32_rvv(int64_t sum[4],
> + const int32_t *coef0,
> + const int32_t *coef1,
> + int len);
>
> av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> {
> @@ -35,6 +39,10 @@ av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
> c->float_to_fixed24 = ff_float_to_fixed24_rvv;
> }
> +#if (__riscv_xlen >= 64)
> + if (flags & AV_CPU_FLAG_RVV_I64)
> + c->sum_square_butterfly_int32 = ff_ac3_sum_square_butterfly_int32_rvv;
> +#endif
> #endif
> }
>
> diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
> index d98e72c12c..4e0d238f85 100644
> --- a/libavcodec/riscv/ac3dsp_rvv.S
> +++ b/libavcodec/riscv/ac3dsp_rvv.S
> @@ -63,3 +63,56 @@ func ff_float_to_fixed24_rvv, zve32x
> bgtz a2, 1b
> ret
> endfunc
> +
> +
> +func ff_ac3_sum_square_butterfly_int32_rvv, zve64x
> + vsetvli t0, a3, e32, m2
> + vle32.v v0, (a1)
> + vle32.v v2, (a2)
> + vadd.vv v4, v0, v2
> + vsub.vv v6, v0, v2
> + vwmul.vv v8, v0, v0
> + vwmul.vv v12, v2, v2
> + vwmul.vv v16, v4, v4
> + vwmul.vv v20, v6, v6
> + sub a3, a3, t0
> + slli t0, t0, 2
> + add a1, a1, t0
> + add a2, a2, t0
> + beq a3, x0, 2f
> +1:
> + vsetvli t0, a3, e32, m2
> + vle32.v v0, (a1)
> + vle32.v v2, (a2)
> + vadd.vv v4, v0, v2
> + vsub.vv v6, v0, v2
> + vwmacc.vv v8, v0, v0
> + vwmacc.vv v12, v2, v2
> + vwmacc.vv v16, v4, v4
> + vwmacc.vv v20, v6, v6
> + sub a3, a3, t0
> + slli t0, t0, 2
> + add a1, a1, t0
> + add a2, a2, t0
> + bnez a3, 1b
> +2:
> + vsetvli t0, x0, e64, m4
> + vmv.s.x v24, x0
> + vmv.s.x v25, x0
> + vmv.s.x v26, x0
> + vmv.s.x v27, x0
> + vredsum.vs v24, v8, v24
> + vredsum.vs v25, v12, v25
> + vredsum.vs v26, v16, v26
> + vredsum.vs v27, v20, v27
As far as I can tell, this is a reserved encoding (cf. RVV 1.0 §3.4.2), and I
believe that QEMU throws an illegal instruction exception in this case. (I would
check, but there is no checkasm test case for this function.) Does this actually
work on your simulator? Because if so, then your simulator is probably broken/
buggy.
> + vsetivli t0, 1, e64, m1
> + vse64.v v24, (a0)
> + addi a0, a0, 8
> + vse64.v v25, (a0)
> + addi a0, a0, 8
> + vse64.v v26, (a0)
> + addi a0, a0, 8
> + vse64.v v27, (a0)
> + addi a0, a0, 8
> + ret
> +endfunc
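For reference when writing that test case, here is a minimal C sketch of the four sums the function must produce. It is modelled on the scalar version in libavcodec/ac3dsp.c; the name sum_square_butterfly_int32_ref is made up for illustration, and it assumes the coefficients are small enough that lt + rt and lt - rt do not overflow 32 bits.

```c
#include <stdint.h>

/* Illustrative scalar model (hypothetical name). */
static void sum_square_butterfly_int32_ref(int64_t sum[4],
                                           const int32_t *coef0,
                                           const int32_t *coef1,
                                           int len)
{
    sum[0] = sum[1] = sum[2] = sum[3] = 0;

    for (int i = 0; i < len; i++) {
        int32_t lt = coef0[i];
        int32_t rt = coef1[i];
        int32_t md = lt + rt;   /* 32-bit add, like vadd.vv */
        int32_t sd = lt - rt;   /* 32-bit subtract, like vsub.vv */

        /* Widening multiply-accumulate into 64 bits, like vwmacc.vv. */
        sum[0] += (int64_t)lt * lt;
        sum[1] += (int64_t)rt * rt;
        sum[2] += (int64_t)md * md;
        sum[3] += (int64_t)sd * sd;
    }
}
```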
--
雷米‧德尼-库尔蒙
http://www.remlab.net/
* Re: [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
2023-06-15 19:25 ` Rémi Denis-Courmont
@ 2023-06-16 10:15 ` 沈佩婷
0 siblings, 0 replies; 14+ messages in thread
From: 沈佩婷 @ 2023-06-16 10:15 UTC (permalink / raw)
To: FFmpeg development discussions and patches
Hi,
> -----Original Message-----
> From: "Rémi Denis-Courmont" <remi@remlab.net>
> Sent: 2023-06-16 03:25:07 (Friday)
> To: ffmpeg-devel@ffmpeg.org
> Cc: "Shen Peiting" <shenpeiting@eswincomputing.com>
> Subject: Re: [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
>
> On Thursday, 15 June 2023 at 13:36:42 EEST, Peiting Shen wrote:
> > From: Shen Peiting <shenpeiting@eswincomputing.com>
> >
> > Scalar calculating int32 sum_square optimized by using RVV instructions
> >
> > Benchmarks on Spike(cycles):
> > len=128
> > ac3_sum_square_butterfly_int32_c: 8497
> > ac3_sum_square_butterfly_int32_rvv: 258
> > len=1280
> > ac3_sum_square_butterfly_int32_c: 84529
> > ac3_sum_square_butterfly_int32_rvv: 2274
> >
> > Co-Authored by: Yang Xiaojun <yangxiaojun@eswincomputing.com>
> > Co-Authored by: Huang Xing <huangxing1@eswincomputing.com>
> > Co-Authored by: Zeng Fanchen <zengfanchen@eswincomputing.com>
> > Signed-off-by: Shen Peiting <shenpeiting@eswincomputing.com>
> > ---
> > libavcodec/riscv/ac3dsp_init.c | 8 +++++
> > libavcodec/riscv/ac3dsp_rvv.S | 53 ++++++++++++++++++++++++++++++++++
> > 2 files changed, 61 insertions(+)
> >
> > diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
> > index a4e75a7541..4fd4abe83e 100644
> > --- a/libavcodec/riscv/ac3dsp_init.c
> > +++ b/libavcodec/riscv/ac3dsp_init.c
> > @@ -26,6 +26,10 @@
> >
> > void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
> > void ff_float_to_fixed24_rvv(int32_t *dst, const float *src, unsigned int len);
> > +void ff_ac3_sum_square_butterfly_int32_rvv(int64_t sum[4],
> > + const int32_t *coef0,
> > + const int32_t *coef1,
> > + int len);
> >
> > av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> > {
> > @@ -35,6 +39,10 @@ av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
> > c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
> > c->float_to_fixed24 = ff_float_to_fixed24_rvv;
> > }
> > +#if (__riscv_xlen >= 64)
> > + if (flags & AV_CPU_FLAG_RVV_I64)
> > + c->sum_square_butterfly_int32 = ff_ac3_sum_square_butterfly_int32_rvv;
> > +#endif
> > #endif
> > }
> >
> > diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
> > index d98e72c12c..4e0d238f85 100644
> > --- a/libavcodec/riscv/ac3dsp_rvv.S
> > +++ b/libavcodec/riscv/ac3dsp_rvv.S
> > @@ -63,3 +63,56 @@ func ff_float_to_fixed24_rvv, zve32x
> > bgtz a2, 1b
> > ret
> > endfunc
> > +
> > +
> > +func ff_ac3_sum_square_butterfly_int32_rvv, zve64x
> > + vsetvli t0, a3, e32, m2
> > + vle32.v v0, (a1)
> > + vle32.v v2, (a2)
> > + vadd.vv v4, v0, v2
> > + vsub.vv v6, v0, v2
> > + vwmul.vv v8, v0, v0
> > + vwmul.vv v12, v2, v2
> > + vwmul.vv v16, v4, v4
> > + vwmul.vv v20, v6, v6
> > + sub a3, a3, t0
> > + slli t0, t0, 2
> > + add a1, a1, t0
> > + add a2, a2, t0
> > + beq a3, x0, 2f
> > +1:
> > + vsetvli t0, a3, e32, m2
> > + vle32.v v0, (a1)
> > + vle32.v v2, (a2)
> > + vadd.vv v4, v0, v2
> > + vsub.vv v6, v0, v2
> > + vwmacc.vv v8, v0, v0
> > + vwmacc.vv v12, v2, v2
> > + vwmacc.vv v16, v4, v4
> > + vwmacc.vv v20, v6, v6
> > + sub a3, a3, t0
> > + slli t0, t0, 2
> > + add a1, a1, t0
> > + add a2, a2, t0
> > + bnez a3, 1b
> > +2:
> > + vsetvli t0, x0, e64, m4
> > + vmv.s.x v24, x0
> > + vmv.s.x v25, x0
> > + vmv.s.x v26, x0
> > + vmv.s.x v27, x0
> > + vredsum.vs v24, v8, v24
> > + vredsum.vs v25, v12, v25
> > + vredsum.vs v26, v16, v26
> > + vredsum.vs v27, v20, v27
>
> As far as I can tell, this is a reserved encoding (cf. RVV 1.0 §3.4.2), and I
> believe that QEMU throws an illegal instruction exception in this case. (I would
> check, but there is no checkasm test case for this function.) Does this actually
> work on your simulator? Because if so, then your simulator is probably broken/
> buggy.
>
RVV 1.0 §14
Vector reduction operations take a vector register group of elements and a scalar held in
element 0 of a vector register, and perform a reduction using some binary operator, to produce
a scalar result in element 0 of a vector register. The scalar input and output operands
are held in element 0 of a single vector register, not a vector register group, so any vector
register can be the scalar source or destination of a vector reduction regardless of LMUL setting.
RVV 1.0 §16.1. Integer Scalar Move Instructions
The integer scalar read/write instructions transfer a single value between a scalar x register and
element 0 of a vector register. The instructions ignore LMUL and vector register groups.
Based on the above, I think this encoding is legal.
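To make our reading of §14 concrete, here is a rough scalar model in C of how we understand vredsum.vs at SEW=64. The helper name vredsum_model is only illustrative; masking is ignored, vl > 0 is assumed, and the sum is assumed not to overflow 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of "vredsum.vs vd, vs2, vs1" (hypothetical helper):
 * vd[0] = vs1[0] + vs2[0] + ... + vs2[vl-1].
 * Only element 0 of vs1 and vd takes part; vs2 is the full register
 * group, so vl may cover several registers when LMUL > 1. */
static int64_t vredsum_model(const int64_t *vs2, size_t vl, int64_t vs1_elem0)
{
    int64_t acc = vs1_elem0;

    for (size_t i = 0; i < vl; i++)
        acc += vs2[i];
    return acc;
}
```

In the patch, vs2 is one of the four m4 accumulator groups (v8, v12, v16, v20) and vs1/vd is one of v24..v27 holding zero in element 0.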
In fact, all FATE tests pass for us on QEMU 6.0.0, with FFmpeg compiled by riscv-unknown-linux-gnu-gcc 13.0.1 and configured as follows:
./configure --enable-cross-compile --cross-prefix=riscv64-unknown-linux-gnu- --arch=riscv
--extra-cflags="-march=rv64imafdcbv -mabi=lp64d --static -I/home/user/code/iconv/iconv-riscv/include"
--prefix=ffshare --extra-libs="-static -liconv" --extra-ldflags="-L/home/user/code/iconv/iconv-riscv/lib"
--target-os=linux --target-exec="qemu-riscv64 -cpu rv64,x-v=true,x-b=true,x-zpn=true,x-zbpbo=true,x-zpsfoperand=true,x-arith=true"
--enable-gpl --enable-memory-poisoning
We will fix the non-standard coding pointed out in the review emails and add the checkasm tests in v2 of the patch set.
> > + vsetivli t0, 1, e64, m1
> > + vse64.v v24, (a0)
> > + addi a0, a0, 8
> > + vse64.v v25, (a0)
> > + addi a0, a0, 8
> > + vse64.v v26, (a0)
> > + addi a0, a0, 8
> > + vse64.v v27, (a0)
> > + addi a0, a0, 8
> > + ret
> > +endfunc
>
>
> --
> 雷米‧德尼-库尔蒙
> http://www.remlab.net/
>
>
>
end of thread, other threads:[~2023-06-16 10:15 UTC | newest]
Thread overview: 14+ messages
2023-06-15 10:36 [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min Peiting Shen
2023-06-15 18:02 ` Rémi Denis-Courmont
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 2/6] lavc/ac3dsp: RISC-V V float_to_fixed24 Peiting Shen
2023-06-15 18:06 ` Rémi Denis-Courmont
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 3/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32 Peiting Shen
2023-06-15 19:25 ` Rémi Denis-Courmont
2023-06-16 10:15 ` 沈佩婷
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 4/6] lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 5/6] lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size Peiting Shen
2023-06-15 10:36 ` [FFmpeg-devel] [PATCH 6/6] lavc/ac3dsp: RISC-V B ac3_extract_exponents Peiting Shen
2023-06-15 19:18 ` Rémi Denis-Courmont
2023-06-15 13:57 ` [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp Lynne
2023-06-15 19:10 ` Rémi Denis-Courmont